service_skeleton 0.0.0.49.g47046b9
- checksums.yaml +7 -0
- data/.editorconfig +7 -0
- data/.gitignore +9 -0
- data/.rubocop.yml +1 -0
- data/.travis.yml +11 -0
- data/.yardopts +1 -0
- data/CODE_OF_CONDUCT.md +49 -0
- data/CONTRIBUTING.md +13 -0
- data/LICENCE +674 -0
- data/README.md +767 -0
- data/lib/service_skeleton.rb +41 -0
- data/lib/service_skeleton/config.rb +133 -0
- data/lib/service_skeleton/config_class.rb +16 -0
- data/lib/service_skeleton/config_variable.rb +44 -0
- data/lib/service_skeleton/config_variable/boolean.rb +21 -0
- data/lib/service_skeleton/config_variable/enum.rb +27 -0
- data/lib/service_skeleton/config_variable/float.rb +25 -0
- data/lib/service_skeleton/config_variable/integer.rb +25 -0
- data/lib/service_skeleton/config_variable/kv_list.rb +26 -0
- data/lib/service_skeleton/config_variable/path_list.rb +13 -0
- data/lib/service_skeleton/config_variable/string.rb +18 -0
- data/lib/service_skeleton/config_variable/url.rb +36 -0
- data/lib/service_skeleton/config_variable/yaml_file.rb +42 -0
- data/lib/service_skeleton/config_variables.rb +79 -0
- data/lib/service_skeleton/error.rb +10 -0
- data/lib/service_skeleton/filtering_logger.rb +38 -0
- data/lib/service_skeleton/generator.rb +165 -0
- data/lib/service_skeleton/logging_helpers.rb +28 -0
- data/lib/service_skeleton/metric_method_name.rb +9 -0
- data/lib/service_skeleton/metrics_methods.rb +37 -0
- data/lib/service_skeleton/runner.rb +46 -0
- data/lib/service_skeleton/service_name.rb +20 -0
- data/lib/service_skeleton/signal_manager.rb +202 -0
- data/lib/service_skeleton/signals_methods.rb +15 -0
- data/lib/service_skeleton/ultravisor_children.rb +17 -0
- data/lib/service_skeleton/ultravisor_loggerstash.rb +11 -0
- data/service_skeleton.gemspec +55 -0
- metadata +356 -0
data/README.md
ADDED
@@ -0,0 +1,767 @@
|
|
1
|
+
The `ServiceSkeleton` provides the bare bones of a "service" program -- one
|
2
|
+
which is intended to be long-lived, providing some sort of functionality to
|
3
|
+
other parts of a larger system. It provides:
|
4
|
+
|
5
|
+
* A Logger, including dynamic log-level and filtering management;
|
6
|
+
* Prometheus-based metrics registry;
|
7
|
+
* Signal handling;
|
8
|
+
* Configuration extraction from the process environment;
|
9
|
+
* Supervision and automated restarting of your service code;
|
10
|
+
* and more.
|
11
|
+
|
12
|
+
The general philosophy of `ServiceSkeleton` is to provide features which have
|
13
|
+
been found to be almost universally necessary in modern deployment
|
14
|
+
configurations, to prefer convenience over configuration, and to always be
|
15
|
+
secure by default.
|
16
|
+
|
17
|
+
|
18
|
+
# Installation
|
19
|
+
|
20
|
+
It's a gem:
|
21
|
+
|
22
|
+
gem install service_skeleton
|
23
|
+
|
24
|
+
There's also the wonders of [the Gemfile](http://bundler.io):
|
25
|
+
|
26
|
+
gem 'service_skeleton'
|
27
|
+
|
28
|
+
If you're the sturdy type that likes to run from git:
|
29
|
+
|
30
|
+
rake install
|
31
|
+
|
32
|
+
Or, if you've eschewed the convenience of Rubygems entirely, then you
|
33
|
+
presumably know what to do already.
|
34
|
+
|
35
|
+
|
36
|
+
# Usage
|
37
|
+
|
38
|
+
A very minimal implementation of a service using `ServiceSkeleton`, which
|
39
|
+
simply prints "Hello, Service!" to stdout every second or so, might look
|
40
|
+
like this:
|
41
|
+
|
42
|
+
require "service_skeleton"
|
43
|
+
|
44
|
+
class HelloService
|
45
|
+
include ServiceSkeleton
|
46
|
+
|
47
|
+
def run
|
48
|
+
loop do
|
49
|
+
puts "Hello, Service!"
|
50
|
+
sleep 1
|
51
|
+
end
|
52
|
+
end
|
53
|
+
end
|
54
|
+
|
55
|
+
ServiceSkeleton::Runner.new(HelloService, ENV).run if __FILE__ == $0
|
56
|
+
|
57
|
+
First, we require the `"service_skeleton"` library, which is a pre-requisite
|
58
|
+
for the `ServiceSkeleton` module to be available. Your code is placed in
|
59
|
+
its own class in the `run` method, where you put your service's logic. The
|
60
|
+
`ServiceSkeleton` module provides helper methods and initializers, which will
|
61
|
+
be introduced as we go along.
|
62
|
+
|
63
|
+
The `run` method is typically an infinite loop, because services are long-running,
|
64
|
+
persistent processes. If your `run` method exits, or raises an unhandled exception,
|
65
|
+
the supervisor will restart it.
|
66
|
+
|
67
|
+
Finally, the last line uses the `ServiceSkeleton::Runner` class to actually run
|
68
|
+
your service. This ensures that all of the scaffolding services, like the
|
69
|
+
signal handler and metrics server, are up and running alongside your service
|
70
|
+
code.
|
71
|
+
|
72
|
+
|
73
|
+
## The `#run` loop
|
74
|
+
|
75
|
+
The core of a service is usually some sort of infinite loop, which waits for a
|
76
|
+
reason to do something, and then does it. A lot of services are network
|
77
|
+
accessible, and so the "reason to do something" is "because someone made a
|
78
|
+
connection to a port on which I'm listening". Other times it could be because
|
79
|
+
of a periodic timer firing, a filesystem event, or anything else that takes
|
80
|
+
your fancy.
|
81
|
+
|
82
|
+
Whatever it is, `ServiceSkeleton` doesn't discriminate. All you have to do is
|
83
|
+
write it in your service class' `#run` method, and we'll take care of the rest.
|
84
|
+
|
85
|
+
|
86
|
+
### STAHP!
|
87
|
+
|
88
|
+
When your service needs to be stopped for one reason or another, `ServiceSkeleton`
|
89
|
+
needs to be able to tell your code to stop. By default, the thread that is
|
90
|
+
running your service will just be killed, which might be fine if your service
|
91
|
+
holds no state or persistent resources, but often that isn't the case.
|
92
|
+
|
93
|
+
If your code needs to stop gracefully, you should define a (thread-safe)
|
94
|
+
instance method, `#shutdown`, which does whatever is required to signal to
|
95
|
+
your service worker code that it is time to return from the `#run` method.
|
96
|
+
What that does, exactly, is up to you.
|
97
|
+
|
98
|
+
```
|
99
|
+
class CustomShutdownService
|
100
|
+
include ServiceSkeleton
|
101
|
+
|
102
|
+
def run
|
103
|
+
until @shutdown do
|
104
|
+
puts "Hello, Service!"
|
105
|
+
sleep 1
|
106
|
+
end
|
107
|
+
|
108
|
+
puts "Shutting down gracefully..."
|
109
|
+
end
|
110
|
+
|
111
|
+
def shutdown
|
112
|
+
@shutdown = true
|
113
|
+
end
|
114
|
+
end
|
115
|
+
```
|
116
|
+
|
117
|
+
To avoid the unpleasantness of a hung service, there is a limit on the amount
|
118
|
+
of time that `ServiceSkeleton` will wait for your service code to terminate.
|
119
|
+
This is, by default, five seconds, but you can modify that by defining a
|
120
|
+
`#shutdown_timeout` method, which returns a `Numeric`, to specify the number of
|
121
|
+
seconds that `ServiceSkeleton` should wait for termination.
|
122
|
+
|
123
|
+
```
|
124
|
+
class SlowShutdownService
|
125
|
+
include ServiceSkeleton
|
126
|
+
|
127
|
+
def run
|
128
|
+
until @shutdown do
|
129
|
+
puts "Hello, Service!"
|
130
|
+
sleep 60
|
131
|
+
end
|
132
|
+
end
|
133
|
+
|
134
|
+
def shutdown
|
135
|
+
@shutdown = true
|
136
|
+
end
|
137
|
+
|
138
|
+
def shutdown_timeout
|
139
|
+
# We need an unusually long shutdown timeout for this service because
|
140
|
+
# the shutdown flag is only checked once a minute, which is much longer
|
141
|
+
# than the default shutdown period.
|
142
|
+
90
|
143
|
+
end
|
144
|
+
end
|
145
|
+
```
|
146
|
+
|
147
|
+
If your service code does not terminate before the timeout, the thread will be,
|
148
|
+
once again, unceremoniously killed.
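One way to keep the shutdown timeout short is to make the wait itself interruptible, so that `#shutdown` wakes the loop straight away instead of waiting for a long `sleep` to finish. The following is a minimal sketch using only the Ruby standard library. The class name `InterruptibleWorker` and its `iterations` counter are illustrative, and `include ServiceSkeleton` is left out so that the snippet runs on its own.

```ruby
# A sketch of a worker whose periodic wait can be interrupted by #shutdown.
class InterruptibleWorker
  attr_reader :iterations

  def initialize(interval)
    @interval = interval
    @mutex = Mutex.new
    @wakeup = ConditionVariable.new
    @shutdown = false
    @iterations = 0
  end

  def run
    loop do
      @iterations += 1 # stand-in for the real work

      @mutex.synchronize do
        # Check the flag under the lock so a shutdown request is never missed.
        return if @shutdown
        # Sleeps for up to @interval seconds, or until #shutdown signals us.
        @wakeup.wait(@mutex, @interval)
        return if @shutdown
      end
    end
  end

  # Safe to call from any ordinary thread.
  def shutdown
    @mutex.synchronize do
      @shutdown = true
      @wakeup.signal
    end
  end
end
```

With this shape, a worker that only does something once a minute can still stop within the default shutdown period, because the wait ends as soon as `#shutdown` is called.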
|
149
|
+
|
150
|
+
|
151
|
+
### Exceptional Behaviour
|
152
|
+
|
153
|
+
If your `#run` loop happens to raise an unhandled exception, it will be caught,
|
154
|
+
logged, and your service will be restarted. This involves instantiating a new
|
155
|
+
instance of your service class, and calling `#run` again.
|
156
|
+
|
157
|
+
In the event that the problem that caused the exception isn't transient, and
|
158
|
+
your service code keeps exiting (either by raising an exception, or the `#run`
|
159
|
+
method returning), the supervisor will, after a couple of retries, terminate
|
160
|
+
the whole process.
|
161
|
+
|
162
|
+
This allows for a *really* clean slate restart, by starting a whole new
|
163
|
+
process. Your process manager should handle automatically restarting the
|
164
|
+
process in a sensible manner.
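The restart behaviour described above can be pictured as a small loop. This is only a sketch of the idea, not the supervisor that `ServiceSkeleton` actually uses: the method name `sketch_supervise`, the `max_restarts` parameter, and the no-argument constructor are all assumptions, and it returns the number of attempts rather than terminating the process so that it is easy to experiment with.

```ruby
# Runs service_class#run, restarting it with a fresh instance each time it
# returns or raises, and gives up once max_restarts restarts have been used.
def sketch_supervise(service_class, max_restarts: 2)
  attempts = 0

  loop do
    attempts += 1

    begin
      service_class.new.run
      warn "#{service_class}#run returned"
    rescue StandardError => ex
      warn "#{service_class}#run raised #{ex.class}: #{ex.message}"
    end

    # A real supervisor would terminate the whole process at this point.
    return attempts if attempts > max_restarts
  end
end
```

A production supervisor would normally also consider how quickly the failures happen, so that an occasional crash days apart does not count towards the limit.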
|
165
|
+
|
166
|
+
|
167
|
+
## The Service Name
|
168
|
+
|
169
|
+
Several aspects of a `ServiceSkeleton` service, including environment variable
|
170
|
+
and metric names, can incorporate the service's name, usually as a prefix. The
|
171
|
+
service name is derived from the name of the class that you provide to
|
172
|
+
`ServiceSkeleton::Runner.new`, by converting the `CamelCase` class name into a
|
173
|
+
`snake_case` service name. If the class name is in a namespace, that is
|
174
|
+
included also, with the `::` turned into `_`.
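As an illustration of the conversion described above, here is a minimal sketch. The helper name `sketch_service_name` is hypothetical, and the handling of runs of capitals (such as `HTTPServer`) is not specified in this document, so the sketch makes no attempt to split them.

```ruby
# Converts a (possibly namespaced) CamelCase class name into a snake_case
# service name, turning "::" into "_".
def sketch_service_name(class_name)
  class_name
    .gsub("::", "_")                    # namespace separator becomes "_"
    .gsub(/([a-z\d])([A-Z])/, '\1_\2')  # fooBar becomes foo_Bar
    .downcase
end

puts sketch_service_name("GenericHelloService")
puts sketch_service_name("MyApp::HelloService")
```

The first call yields `generic_hello_service`, which matches the prefix used for `GenericHelloService` later in this document.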
|
175
|
+
|
176
|
+
|
177
|
+
## Configuration
|
178
|
+
|
179
|
+
Almost every service has a need for some amount of configuration. In keeping
|
180
|
+
with the general principles of the [12 factor app](https://12factor.net),
|
181
|
+
`ServiceSkeleton` takes configuration from the environment. However, we try to
|
182
|
+
minimise the amount of manual effort you need to expend to make that happen,
|
183
|
+
and provide configuration management as a first-class operation.
|
184
|
+
|
185
|
+
|
186
|
+
### Basic Configuration
|
187
|
+
|
188
|
+
The `ServiceSkeleton` module defines an instance method, called `#config`, which
|
189
|
+
returns an instance of {ServiceSkeleton::Config} (or some other class you
|
190
|
+
specify; more on that below), which provides access to the environment that was
|
191
|
+
passed into the service object at instantiation time (i.e. the `ENV` in
|
192
|
+
`ServiceSkeleton::Runner.new(MyService, ENV)`) via the `#[]` method. So, in a very simple
|
193
|
+
application where you want to get the name of the thing to say hello to, it
|
194
|
+
might look like this:
|
195
|
+
|
196
|
+
class GenericHelloService
|
197
|
+
include ServiceSkeleton
|
198
|
+
|
199
|
+
def run
|
200
|
+
loop do
|
201
|
+
puts "Hello, #{config["RECIPIENT"]}!"
|
202
|
+
sleep 1
|
203
|
+
end
|
204
|
+
end
|
205
|
+
end
|
206
|
+
|
207
|
+
ServiceSkeleton::Runner.new(GenericHelloService, "RECIPIENT" => "Bob").run
|
208
|
+
|
209
|
+
This will print "Hello, Bob!" every second.
|
210
|
+
|
211
|
+
|
212
|
+
### Declaring Configuration Variables
|
213
|
+
|
214
|
+
If your application has very minimal needs, it's possible that directly
|
215
|
+
accessing the environment will be sufficient. However, you can (and usually
|
216
|
+
should) declare your configuration variables in your service class, because
|
217
|
+
that way you can get coerced values (numbers, booleans, lists, etc, rather than
|
218
|
+
just plain strings), range and format checking (say "the number must be an
|
219
|
+
integer between one and ten", or "the string must match this regex"), default
|
220
|
+
values, and error reporting. You also get direct access to the configuration
|
221
|
+
value as a method call on the `config` object.
|
222
|
+
|
223
|
+
To declare configuration variables, simply call one of the "config declaration
|
224
|
+
methods" (as listed in the `ServiceSkeleton::ConfigVariables` module) in your
|
225
|
+
class definition, and pass it an environment variable name (as a string or
|
226
|
+
symbol) and any relevant configuration parameters (like a default, or a
|
227
|
+
validity range, or whatever).
|
228
|
+
|
229
|
+
When you run your service (via {ServiceSkeleton::Runner#new}), the environment
|
230
|
+
you pass in will be examined and the configuration initialised. If any values
|
231
|
+
are invalid (number out of range, etc) or missing (for any configuration
|
232
|
+
variable that doesn't have a default), then a
|
233
|
+
{ServiceSkeleton::InvalidEnvironmentError} exception will be raised and the
|
234
|
+
service will not start.
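To make the idea of coercion, range checking, and defaults concrete, here is a standalone sketch of what validating a single integer variable could involve. It does not use or reproduce the gem's own code: `SketchInvalidEnvironmentError`, `sketch_integer_var`, and the `WORKERS` variable are hypothetical names chosen for illustration.

```ruby
# Hypothetical stand-in for the error raised on a bad environment.
class SketchInvalidEnvironmentError < StandardError; end

# Sentinel meaning "no default was declared".
SKETCH_NO_DEFAULT = Object.new.freeze

# Reads `name` from `env`, coerces it to an Integer, and checks the range.
def sketch_integer_var(env, name, range: nil, default: SKETCH_NO_DEFAULT)
  raw = env[name.to_s]

  if raw.nil?
    if default.equal?(SKETCH_NO_DEFAULT)
      raise SketchInvalidEnvironmentError, "#{name} must be set"
    end
    return default
  end

  value = begin
    Integer(raw, 10)
  rescue ArgumentError
    raise SketchInvalidEnvironmentError,
          "#{name} must be an integer, got #{raw.inspect}"
  end

  if range && !range.cover?(value)
    raise SketchInvalidEnvironmentError,
          "#{name} must be in #{range}, got #{value}"
  end

  value
end
```

Calling `sketch_integer_var({ "WORKERS" => "4" }, :WORKERS, range: 1..10)` returns the integer `4`, while a missing variable with no default, a non-numeric string, or an out-of-range number raises the error.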
|
235
|
+
|
236
|
+
During your service's execution, any time you need to access a configuration
|
237
|
+
value, just call the matching method name (the all-lowercase version of the
|
238
|
+
environment variable name, without the service name prefix) on `config`, and
|
239
|
+
you'll get the value in your lap.
|
240
|
+
|
241
|
+
Here's a version of our generic greeter service, using declared configuration
|
242
|
+
variables:
|
243
|
+
|
244
|
+
class GenericHelloService
|
245
|
+
include ServiceSkeleton
|
246
|
+
|
247
|
+
string :RECIPIENT, matches: /\A\w+\z/
|
248
|
+
|
249
|
+
def run
|
250
|
+
loop do
|
251
|
+
puts "Hello, #{config.recipient}!"
|
252
|
+
sleep 1
|
253
|
+
end
|
254
|
+
end
|
255
|
+
end
|
256
|
+
|
257
|
+
begin
|
258
|
+
ServiceSkeleton::Runner.new(GenericHelloService, ENV).run
|
259
|
+
rescue ServiceSkeleton::InvalidEnvironmentError => ex
|
260
|
+
$stderr.puts "Configuration error found: #{ex.message}"
|
261
|
+
exit 1
|
262
|
+
end
|
263
|
+
|
264
|
+
This service, if run without a `RECIPIENT` environment variable being available,
|
265
|
+
will exit with an error. If that isn't what you want, you can declare a
|
266
|
+
default for a config variable, like so:
|
267
|
+
|
268
|
+
class GenericHelloService
|
269
|
+
include ServiceSkeleton
|
270
|
+
|
271
|
+
string :RECIPIENT, matches: /\A\w+\z/, default: "Anonymous Coward"
|
272
|
+
|
273
|
+
# ...
|
274
|
+
|
275
|
+
*This* version will print "Hello, Anonymous Coward!" if no `RECIPIENT`
|
276
|
+
environment variable is available.
|
277
|
+
|
278
|
+
|
279
|
+
### Environment Variable Prefixes
|
280
|
+
|
281
|
+
It's common for all (or almost all) of your environment variables to have a
|
282
|
+
common prefix, usually named for your service, to distinguish your service's
|
283
|
+
configuration from any other environment variables lying around. However, to
|
284
|
+
save on typing, you don't want to have to use that prefix when accessing your
|
285
|
+
`config` methods.
|
286
|
+
|
287
|
+
Enter: the service name prefix. Any of your environment variables whose name
|
288
|
+
starts with [your service's name](#the-service-name) (matched
|
289
|
+
case-insensitively) followed by an underscore will have that part of the
|
290
|
+
environment variable name removed to determine the method name on `config`.
|
291
|
+
The *original* environment variable name is still matched to a variable
|
292
|
+
declaration, so you need to declare the variable *with* the prefix; it is only
|
293
|
+
the method name on the `config` object that won't have the prefix.
|
294
|
+
|
295
|
+
Using this environment variable prefix support, the `GenericHelloService` would
|
296
|
+
have a (case-insensitive) prefix of `generic_hello_service_`. In that case,
|
297
|
+
extending the above example a little more, you could do something like this:
|
298
|
+
|
299
|
+
class GenericHelloService
|
300
|
+
include ServiceSkeleton
|
301
|
+
|
302
|
+
string :GENERIC_HELLO_SERVICE_RECIPIENT, matches: /\A\w+\z/
|
303
|
+
|
304
|
+
def run
|
305
|
+
loop do
|
306
|
+
puts "Hello, #{config.recipient}!"
|
307
|
+
sleep 1
|
308
|
+
end
|
309
|
+
end
|
310
|
+
end
|
311
|
+
|
312
|
+
Then, if the environment contained `GENERIC_HELLO_SERVICE_RECIPIENT`, its value
|
313
|
+
would be accessible via `config.recipient` in the program.
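The mapping from environment variable name to `config` method name can be sketched in a couple of lines. The helper `sketch_config_method_name` is hypothetical and only mirrors the rule described above (strip the case-insensitive service name prefix and its underscore, then lowercase).

```ruby
# Derives the config method name for an environment variable, given the
# snake_case service name.
def sketch_config_method_name(env_var, service_name)
  prefix = /\A#{Regexp.escape(service_name)}_/i
  env_var.to_s.sub(prefix, "").downcase
end

puts sketch_config_method_name(:GENERIC_HELLO_SERVICE_RECIPIENT, "generic_hello_service")
puts sketch_config_method_name(:RECIPIENT, "generic_hello_service")
```

Both calls give `recipient`, which is why the prefixed and unprefixed declarations are read through the same `config.recipient` call.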
|
314
|
+
|
315
|
+
|
316
|
+
### Sensitive environment variables
|
317
|
+
|
318
|
+
Sometimes your service will take configuration data that really, *really*
|
319
|
+
shouldn't be available to subprocesses or anyone who manages to catch a
|
320
|
+
sneak-peek at your service's environment. In that case, you can declare an
|
321
|
+
environment variable as "sensitive", and after the configuration is parsed,
|
322
|
+
that environment variable will be redacted from the environment.
|
323
|
+
|
324
|
+
To declare an environment variable as "sensitive", simply pass the `sensitive`
|
325
|
+
parameter, with a trueish value, to the variable declaration in your class:
|
326
|
+
|
327
|
+
class DatabaseManager
|
328
|
+
include ServiceSkeleton
|
329
|
+
|
330
|
+
string :DB_PASSWORD, sensitive: true
|
331
|
+
|
332
|
+
# ...
|
333
|
+
end
|
334
|
+
|
335
|
+
> **NOTE**: The process environment can only be modified if you pass the real,
|
336
|
+
> honest-to-goodness `ENV` object into `ServiceSkeleton::Runner.new(MyServiceClass, ENV)`. If you
|
337
|
+
> provide a copy of `ENV`, or some other hash entirely, that'll work if you
|
338
|
+
> don't have any sensitive variables declared, but the moment you declare a
|
339
|
+
> sensitive variable, passing in any hash other than `ENV` will cause the
|
340
|
+
> service to log an error and refuse to start. This avoids the problems of
|
341
|
+
> accidentally modifying global state if that would be potentially bad (we
|
342
|
+
> assume you copied `ENV` for a reason) without leaving a gaping security hole
|
343
|
+
> (sensitive data blindly passed into subprocesses that you didn't expect).
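The rule in the note above can be illustrated with a short standalone sketch. It assumes that "redacted" means the variable is deleted from the environment, which this document does not spell out, and it raises an `ArgumentError` where the real service would log an error and refuse to start. The helper name `sketch_take_sensitive` and the variable `SKETCH_DB_PASSWORD` are made up for the example.

```ruby
# Returns the value of a sensitive variable and removes it from the process
# environment, refusing to work on anything other than the real ENV object.
def sketch_take_sensitive(env, name)
  unless env.equal?(ENV)
    raise ArgumentError, "sensitive variable #{name} requires the real ENV object"
  end

  value = env[name]
  env.delete(name) # subprocesses started after this point will not see it
  value
end

ENV["SKETCH_DB_PASSWORD"] = "hunter2"
password = sketch_take_sensitive(ENV, "SKETCH_DB_PASSWORD")
puts ENV.key?("SKETCH_DB_PASSWORD") # false: the variable is gone
```

The identity check (`equal?`) is what distinguishes the real `ENV` from a copy such as `ENV.to_h`, which would otherwise leave the secret in place for child processes.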
|
344
|
+
|
345
|
+
|
346
|
+
### Using a Custom Configuration Class
|
347
|
+
|
348
|
+
Whilst we hope that {ServiceSkeleton::Config} will be useful in most
|
349
|
+
situations, there are undoubtedly cases where the config management we provide
|
350
|
+
won't be enough. In that case, you are encouraged to subclass
|
351
|
+
`ServiceSkeleton::Config` and augment the standard interface with your own
|
352
|
+
implementations (remembering to call `super` where appropriate), and tell
|
353
|
+
`ServiceSkeleton` to use your implementation by calling the `.config_class`
|
354
|
+
class method in your service's class definition, like this:
|
355
|
+
|
356
|
+
class MyServiceConfig < ServiceSkeleton::Config
|
357
|
+
attr_reader :something_funny
|
358
|
+
|
359
|
+
def initialize(env)
|
360
|
+
@something_funny = "flibbety gibbets"
|
361
|
+
end
|
362
|
+
end
|
363
|
+
|
364
|
+
class MyService
|
365
|
+
include ServiceSkeleton
|
366
|
+
|
367
|
+
config_class MyServiceConfig
|
368
|
+
|
369
|
+
def run
|
370
|
+
loop do
|
371
|
+
puts config.something_funny
|
372
|
+
sleep 1
|
373
|
+
end
|
374
|
+
end
|
375
|
+
end
|
376
|
+
|
377
|
+
|
378
|
+
## Logging
|
379
|
+
|
380
|
+
You can't have a good service without good logging. Therefore, the
|
381
|
+
`ServiceSkeleton` does its best to provide a sensible logging implementation
|
382
|
+
for you to use.
|
383
|
+
|
384
|
+
|
385
|
+
### What You Get
|
386
|
+
|
387
|
+
Every instance of your service class has a method named, uncreatively,
|
388
|
+
`logger`. It is a (more-or-less) straight-up instance of the Ruby stdlib
|
389
|
+
`Logger`, on which you can call all the usual methods (`#debug`, `#info`,
|
390
|
+
`#warn`, `#error`, etc). By default, it sends all log messages to standard
|
391
|
+
error.
|
392
|
+
|
393
|
+
When calling the logger, you really, *really* want to use the
|
394
|
+
"progname+message-in-a-block" style of recording log messages, which looks like
|
395
|
+
this:
|
396
|
+
|
397
|
+
logger.debug("lolrus") { "Something funny!" }
|
398
|
+
|
399
|
+
In addition to the potential performance benefits, the `ServiceSkeleton` logger
|
400
|
+
provides the ability to filter on the progname passed to each log message call.
|
401
|
+
That means that you can put in *lots* of debug logging (which is always a good
|
402
|
+
idea), and then turn on debug logging *only* for the part of the system you
|
403
|
+
wish to actively debug, based on log messages that are tagged with a specified
|
404
|
+
progname. No more grovelling through thousands of lines of debug logging to
|
405
|
+
find the One Useful Message.
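The performance benefit comes from the fact that the block is only evaluated if the message will actually be emitted. This can be seen with the plain Ruby stdlib `Logger`, which is what the sketch below uses (the `StringIO` destination and variable names are just for the demonstration).

```ruby
require "logger"
require "stringio"

output = StringIO.new
logger = Logger.new(output)
logger.level = Logger::INFO

block_evaluated = false

# Below the threshold: the block is never called, so an expensive message
# would never be built.
logger.debug("lolrus") do
  block_evaluated = true
  "Something funny!"
end

# At or above the threshold: the block is called and the progname is recorded.
logger.info("lolrus") { "Something else" }

puts block_evaluated # false
puts output.string
```

Only the `INFO` entry reaches the output, and it carries the `lolrus` progname that progname-based filtering relies on.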
|
406
|
+
|
407
|
+
You also get, as part of this package, built-in dynamic log level adjustment;
|
408
|
+
using Unix signals or the admin HTTP interface (if enabled), you can tell the
|
409
|
+
logger to increase or decrease logging verbosity *without interrupting
|
410
|
+
service*. We are truly living in the future.
|
411
|
+
|
412
|
+
Finally, if you're a devotee of the ELK stack, the logger can automagically
|
413
|
+
send log entries straight into logstash, rather than you having to do it in
|
414
|
+
some more roundabout fashion.
|
415
|
+
|
416
|
+
|
417
|
+
### Logging Configuration
|
418
|
+
|
419
|
+
The logger automatically sets its configuration from, you guessed it, the
|
420
|
+
environment. The following environment variables are recognised by the logger.
|
421
|
+
All environment variable names are all-uppercase, and the `<SERVICENAME>_`
|
422
|
+
portion is the all-uppercase [service name](#the-service-name).
|
423
|
+
|
424
|
+
* **`<SERVICENAME>_LOG_LEVEL`** (default: `"INFO"`) -- the minimum severity of
|
425
|
+
log messages which will be emitted by the logger.
|
426
|
+
|
427
|
+
The simple form of this setting is just a severity name: one of `DEBUG`,
|
428
|
+
`INFO`, `WARN`, `ERROR`, or `FATAL` (case-insensitive). This sets the
|
429
|
+
severity threshold for all log messages in the entire service.
|
430
|
+
|
431
|
+
If you wish to change the severity level for a single progname, you can
|
432
|
+
override the default log level for messages with a specific progname, by
|
433
|
+
specifying one or more "progname/severity" pairs, separated by commas. A
|
434
|
+
progname/severity pair looks like this:
|
435
|
+
|
436
|
+
<progname>=<severity>
|
437
|
+
|
438
|
+
To make things even more fun, if `<progname>` looks like a regular expression
|
439
|
+
(starts with `/` or `%r{`, and ends with `/` or `}` plus optional flag
|
440
|
+
characters), then all log messages with prognames *matching* the specified
|
441
|
+
regex will have that severity applied. First match wins. The default is
|
442
|
+
still specified as a bare severity name, and the default can only be set
|
443
|
+
once.
|
444
|
+
|
445
|
+
That's a lot to take in, so here's an example which sets the default to
|
446
|
+
`INFO`, debugs the `buggy` progname, and only emits errors for messages with
|
447
|
+
the (case-insensitive) string `noisy` in their progname:
|
448
|
+
|
449
|
+
INFO,buggy=DEBUG,/noisy/i=ERROR
|
450
|
+
|
451
|
+
Logging levels can be changed at runtime, via [signals](#default-signals) or
|
452
|
+
[the HTTP admin interface](#http-admin-interface).
|
453
|
+
|
454
|
+
* **`<SERVICENAME>_LOGSTASH_SERVER`** (string; default `""`) -- if set to a
|
455
|
+
non-empty string, we will engage the services of the [loggerstash
|
456
|
+
gem](https://github.com/discourse/loggerstash) on your behalf to send all log
|
457
|
+
entries to the logstash server you specify (as [an `address:port`,
|
458
|
+
`hostname:port`, or SRV
|
459
|
+
record](https://github.com/discourse/logstash_writer#usage)). Just be sure
|
460
|
+
and [configure logstash
|
461
|
+
appropriately](https://github.com/discourse/loggerstash#logstash-configuration).
|
462
|
+
|
463
|
+
* **`<SERVICENAME>_LOG_ENABLE_TIMESTAMPS`** (boolean; default: `"no"`) -- if
|
464
|
+
set to a true-ish value (`yes`/`y`/`on`/`true`/`1`), then the log entries
|
465
|
+
emitted by the logger will have the current time (to the nearest nanosecond)
|
466
|
+
prefixed to them, in RFC3339 format
|
467
|
+
(`<YYYY>-<mm>-<dd>T<HH>:<MM>:<SS>.<nnnnnnnnn>Z`). By default, it is assumed
|
468
|
+
that services are run through a supervisor system of some sort, which
|
469
|
+
captures log messages and timestamps them, but if you are in a situation
|
470
|
+
where log messages aren't automatically timestamped, then you can use this to
|
471
|
+
get them back.
|
472
|
+
|
473
|
+
* **`<SERVICENAME>_LOG_FILE`** (string; default: `"/dev/stderr"`) -- the file
|
474
|
+
to which log messages are written. The default, to send messages to standard
|
475
|
+
error, is a good choice if you are using a supervisor system which captures
|
476
|
+
service output to its own logging system; however, if you are stuck without
|
477
|
+
such niceties, you can specify a file on disk to log to instead.
|
478
|
+
|
479
|
+
* **`<SERVICENAME>_LOG_MAX_FILE_SIZE`** (integer; range 0..Inf; default:
|
480
|
+
`"1048576"`) -- if you are logging to a file on disk, you should limit the
|
481
|
+
size of each log file written to prevent disk space exhaustion. This
|
482
|
+
configuration variable specifies the maximum size of any one log file, in
|
483
|
+
bytes. Once the log file exceeds the specified size, it is renamed to
|
484
|
+
`<filename>.0`, and a new log file started.
|
485
|
+
|
486
|
+
If, for some wild reason, you don't wish to limit your log file sizes, you
|
487
|
+
can set this environment variable to `"0"`, in which case log files will
|
488
|
+
never be automatically rotated. In that case, you are solely responsible for
|
489
|
+
rotation and log file management, and [the `SIGHUP` signal](#default-signals)
|
490
|
+
will likely be of interest to you.
|
491
|
+
|
492
|
+
* **`<SERVICENAME>_LOG_MAX_FILES`** (integer; range 1..Inf; default: `"3"`) --
|
493
|
+
if you are logging to a file on disk, you should limit the number of log
|
494
|
+
files kept to prevent disk space exhaustion. This configuration variable
|
495
|
+
specifies the maximum number of log files to keep (including the log file
|
496
|
+
currently being written to). As log files reach `LOG_MAX_FILE_SIZE`, they
|
497
|
+
are rotated out, and older files are renamed with successively higher numeric
|
498
|
+
suffixes. Once there are more than `LOG_MAX_FILES` on disk, the oldest file
|
499
|
+
is deleted to keep disk space under control.
|
500
|
+
|
501
|
+
Using this "file size+file count" log file management method, your logs will
|
502
|
+
only ever consume about `LOG_MAX_FILES*LOG_MAX_FILE_SIZE` bytes of disk
|
503
|
+
space.
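To make the `LOG_LEVEL` format described above more tangible, here is a rough sketch of a parser for it. It is not the gem's implementation: all the `sketch_*` names are hypothetical, it assumes prognames and regexes contain no commas, and it only understands the `i`, `m`, and `x` regex flags.

```ruby
require "logger"

SKETCH_SEVERITIES = {
  "DEBUG" => Logger::DEBUG, "INFO" => Logger::INFO, "WARN" => Logger::WARN,
  "ERROR" => Logger::ERROR, "FATAL" => Logger::FATAL
}.freeze

# Matches /source/flags or %r{source}flags.
SKETCH_REGEX_SPEC = Regexp.new('\A(?:/(.*)/|%r\{(.*)\})([imx]*)\z')

def sketch_severity(name)
  SKETCH_SEVERITIES.fetch(name.strip.upcase) do
    raise ArgumentError, "unknown severity #{name.inspect}"
  end
end

# Returns a Regexp for regex-looking prognames, or the plain string otherwise.
def sketch_progname_matcher(progname)
  match = SKETCH_REGEX_SPEC.match(progname)
  return progname unless match

  flags = 0
  flags |= Regexp::IGNORECASE if match[3].include?("i")
  flags |= Regexp::MULTILINE if match[3].include?("m")
  flags |= Regexp::EXTENDED if match[3].include?("x")
  Regexp.new(match[1] || match[2], flags)
end

# Returns [default_severity, [[matcher, severity], ...]].
def sketch_parse_log_level(spec)
  default = nil
  filters = []

  spec.split(",").each do |part|
    progname, separator, severity = part.strip.rpartition("=")
    if separator.empty?
      raise ArgumentError, "default severity given more than once" if default
      default = sketch_severity(severity)
    else
      filters << [sketch_progname_matcher(progname), sketch_severity(severity)]
    end
  end

  [default || Logger::INFO, filters]
end

# First matching filter wins; otherwise the default applies.
def sketch_level_for(progname, default, filters)
  _, severity = filters.find { |matcher, _| matcher === progname }
  severity || default
end
```

Parsing `INFO,buggy=DEBUG,/noisy/i=ERROR` with this sketch gives a default of `Logger::INFO`, an exact-match filter for `buggy`, and a case-insensitive regex filter for anything containing `noisy`.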
|
504
|
+
|
505
|
+
|
506
|
+
## Metrics
|
507
|
+
|
508
|
+
Running a service without metrics is like trying to fly a fighter jet whilst
|
509
|
+
blindfolded: everything seems to be going OK until you slam into the side of a
|
510
|
+
mountain you never saw coming. For that reason, `ServiceSkeleton` provides a
|
511
|
+
Prometheus-based metrics registry, a bunch of default process-level metrics, an
|
512
|
+
optional HTTP metrics server, and simple integration with [the Prometheus ruby
|
513
|
+
client library](https://rubygems.org/gems/prometheus-client) and [the
|
514
|
+
Frankenstein library](https://rubygems.org/gems/frankenstein) to make it as
|
515
|
+
easy as possible to instrument the heck out of your service.
|
516
|
+
|
517
|
+
|
518
|
+
### Defining and Using Metrics
|
519
|
+
|
520
|
+
All the metrics you want to use within your service need to be registered
|
521
|
+
before use. This is done via class methods, similar to declaring environment
|
522
|
+
variables.
|
523
|
+
|
524
|
+
To register a metric, use one of the standard metric registration methods from
|
525
|
+
[Prometheus::Client::Registry](https://www.rubydoc.info/gems/prometheus-client/0.8.0/Prometheus/Client/Registry)
|
526
|
+
(`counter`, `gauge`, `histogram`, `summary`) or `metric` (equivalent
|
527
|
+
to the `register` method of `Prometheus::Client::Registry`) in your class
|
528
|
+
definition to register the metric for use.
|
529
|
+
|
530
|
+
In our generic greeter service we've been using as an example so far, you might
|
531
|
+
like to define a metric to count how many greetings have been sent. You'd define
|
532
|
+
such a metric like this:
|
533
|
+
|
534
|
+
class GenericHelloService
|
535
|
+
include ServiceSkeleton
|
536
|
+
|
537
|
+
string :GENERIC_HELLO_SERVICE_RECIPIENT, matches: /\A\w+\z/
|
538
|
+
|
539
|
+
counter :greetings_total, docstring: "How many greetings we have sent", labels: %i{recipient}
|
540
|
+
|
541
|
+
# ...
|
542
|
+
|
543
|
+
When it comes time to actually *use* the metrics you have created, you access
|
544
|
+
them as methods on the `metrics` method in your service worker instance. Thus,
|
545
|
+
to increment our greeting counter, you simply do:
|
546
|
+
|
547
|
+
class GenericHelloService
|
548
|
+
include ServiceSkeleton
|
549
|
+
|
550
|
+
string :GENERIC_HELLO_SERVICE_RECIPIENT, matches: /\A\w+\z/
|
551
|
+
|
552
|
+
counter :greetings_total, docstring: "How many greetings we have sent", labels: %i{recipient}
|
553
|
+
|
554
|
+
def run
|
555
|
+
loop do
|
556
|
+
puts "Hello, #{config.recipient}!"
|
557
|
+
metrics.greetings_total.increment(labels: { recipient: config.recipient })
|
558
|
+
sleep 1
|
559
|
+
end
|
560
|
+
end
|
561
|
+
end
|
562
|
+
|
563
|
+
As a bonus, because metric names are typically prefixed with the service name,
|
564
|
+
any metrics you define which have the [service name](#the-service-name) as a
|
565
|
+
prefix will have that prefix (and the immediately-subsequent underscore) removed
|
566
|
+
before defining the metric accessor method, which keeps typing to a minimum:
|
567
|
+
|
568
|
+
class GenericHelloService
|
569
|
+
include ServiceSkeleton
|
570
|
+
|
571
|
+
string :GENERIC_HELLO_SERVICE_RECIPIENT, matches: /\A\w+\z/
|
572
|
+
|
573
|
+
counter :generic_hello_service_greetings_total, docstring: "How many greetings we have sent", labels: %i{recipient}
|
574
|
+
|
575
|
+
def run
|
576
|
+
loop do
|
577
|
+
puts "Hello, #{config.recipient}!"
|
578
|
+
metrics.greetings_total.increment(labels: { recipient: config.recipient })
|
579
|
+
sleep 1
|
580
|
+
end
|
581
|
+
end
|
582
|
+
end
|
583
|
+
|
584
|
+
|
585
|
+
### Default Metrics
|
586
|
+
|
587
|
+
[Recommended
|
588
|
+
practice](https://prometheus.io/docs/instrumenting/writing_clientlibs/#standard-and-runtime-collectors)
|
589
|
+
is for collectors to provide a bunch of standard metrics, and `ServiceSkeleton`
|
590
|
+
never met a recommended practice it didn't like. So, we provide [process
|
591
|
+
metrics](https://www.rubydoc.info/gems/frankenstein/Frankenstein/ProcessMetrics),
|
592
|
+
[Ruby GC
|
593
|
+
metrics](https://www.rubydoc.info/gems/frankenstein/Frankenstein/RubyGCMetrics),
|
594
|
+
and [Ruby VM
|
595
|
+
metrics](https://www.rubydoc.info/gems/frankenstein/Frankenstein/RubyVMMetrics).
|
596
|
+
|
597
|
+
|
598
|
+
### Metrics Server Configuration
|
599
|
+
|
600
|
+
Whilst metrics are always collected, they're not very useful unless they can
|
601
|
+
be scraped by a server. To enable that, you'll need to look at the following
|
602
|
+
configuration variables. All metrics configuration environment variables are
|
603
|
+
all-uppercase, and the `<SERVICENAME>_` portion is the all-uppercase version
|
604
|
+
of [the service name](#the-service-name).
|
605
|
+
|
606
|
+
* **`<SERVICENAME>_METRICS_PORT`** (integer; range 1..65535; default: `""`) --
|
607
|
+
if set to an integer which is a valid port number (`1` to `65535`,
|
608
|
+
inclusive), an HTTP server will be started which will respond to a request to
|
609
|
+
`/metrics` with a Prometheus-compatible dump of time series data.
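For readers who have not seen it, the "Prometheus-compatible dump" is a plain-text format with one line per time series. The sketch below renders a counter in that format using only core Ruby. `SketchCounter` is a toy written for this example, not the Prometheus client's `Counter` class, although its `increment(labels: ...)` call is shaped like the one used in the examples in this document.

```ruby
# A toy labelled counter that can render itself in the Prometheus text format.
class SketchCounter
  def initialize(name, docstring:)
    @name = name
    @docstring = docstring
    @values = Hash.new(0.0)
    @mutex = Mutex.new
  end

  def increment(labels: {}, by: 1)
    @mutex.synchronize { @values[labels] += by }
  end

  def to_exposition
    lines = ["# HELP #{@name} #{@docstring}", "# TYPE #{@name} counter"]
    @mutex.synchronize do
      @values.each do |labels, value|
        lines << "#{@name}#{format_labels(labels)} #{value}"
      end
    end
    lines.join("\n") + "\n"
  end

  private

  def format_labels(labels)
    return "" if labels.empty?

    pairs = labels.map { |key, value| %(#{key}="#{escape(value)}") }
    "{#{pairs.join(",")}}"
  end

  # Backslashes, double quotes, and newlines must be escaped in label values.
  def escape(value)
    value.to_s.gsub(/[\\"\n]/) { |char| char == "\n" ? "\\n" : "\\#{char}" }
  end
end

counter = SketchCounter.new(:greetings_total, docstring: "How many greetings we have sent")
2.times { counter.increment(labels: { recipient: "Bob" }) }
puts counter.to_exposition
```

A scrape of `/metrics` returns text of this general shape for every registered metric, which is what a Prometheus server parses.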
|
610
|
+
|
611
|
+
|
612
|
+
## Signal Handling
|
613
|
+
|
614
|
+
Whilst they're a bit old-fashioned, there's no denying that signals still have
|
615
|
+
a useful place in the arsenal of a modern service. However, there are some
|
616
|
+
caveats that apply to signal handling (like their habit of interrupting at
|
617
|
+
inconvenient moments when you can't use mutexes). For that reason, the
|
618
|
+
`ServiceSkeleton` comes with a signal watcher, which converts specified incoming
|
619
|
+
signals into invocations of regular blocks of code, and a range of default
|
620
|
+
behaviours for common signals.
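A common way to turn signals into ordinary block invocations is the "self-pipe" technique: the trap handler does nothing but write a byte to a pipe, and a normal thread reads the pipe and runs the handlers, where mutexes and other facilities are safe to use. The sketch below shows the idea in plain Ruby. `SketchSignalWatcher` and its methods are hypothetical and are not the gem's signal manager.

```ruby
# Converts incoming signals into calls to ordinary blocks, run on a normal
# thread rather than in trap context.
class SketchSignalWatcher
  def initialize
    @reader, @writer = IO.pipe
    @handlers = Hash.new { |hash, signum| hash[signum] = [] }
    @mutex = Mutex.new
  end

  # Registers a block for a signal name such as "USR1". Hooking the same
  # signal again adds another block rather than replacing the first.
  def hook(signal, &block)
    signum = Signal.list.fetch(signal)
    @mutex.synchronize { @handlers[signum] << block }

    Signal.trap(signum) do
      begin
        # Trap context: just record which signal arrived.
        @writer.write_nonblock([signum].pack("C"))
      rescue IOError, IO::WaitWritable
        # Pipe closed or full: drop the notification.
      end
    end
  end

  # Blocks, dispatching signals to their handlers, until #stop is called.
  def run
    while (byte = @reader.read(1))
      signum = byte.unpack1("C")
      handlers = @mutex.synchronize { @handlers[signum].dup }
      handlers.each(&:call)
    end
  end

  def stop
    @writer.close
  end
end
```

In use, the watcher's `run` method would be given its own thread, and each hooked block then executes on that thread in the order the blocks were registered.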
|
621
|
+
|
622
|
+
|
623
|
+
### Default Signals
|
624
|
+
|
625
|
+
When the `#run` method on a `ServiceSkeleton::Runner` instance is called, the
|
626
|
+
following signals will be hooked, and will perform the described action when
|
627
|
+
that signal is received:
|
628
|
+
|
629
|
+
* **`SIGUSR1`** -- increase logging verbosity by lowering the default minimum severity for messages which
|
630
|
+
will be emitted by the logger (`FATAL` -> `ERROR` -> `WARN` -> `INFO` ->
|
631
|
+
`DEBUG`). The default severity only applies to log messages whose progname
|
632
|
+
does not match a "progname/severity" pair (see [Logging
|
633
|
+
Configuration](#logging-configuration)).
|
634
|
+
|
635
|
+
* **`SIGUSR2`** -- decrease the default minimum severity for messages which
|
636
|
+
will be emitted by the logger.
|
637
|
+
|
638
|
+
* **`SIGHUP`** -- close and reopen the log file, if logging to a file on disk.
|
639
|
+
Because of the `ServiceSkeleton`'s default log rotation policy, this shouldn't
|
640
|
+
ordinarily be required, but if you've turned off the default log rotation,
|
641
|
+
you may need this.
|
642
|
+
|
643
|
+
* **`SIGQUIT`** -- dump a *whooooooole* lot of debugging information to
|
644
|
+
standard error, including memory allocation summaries and stack traces of all
|
645
|
+
running threads. If you've ever sent `SIGQUIT` a Java program, or
|
646
|
+
`SIGABRT` to a golang program, you know how handy this can be in certain
|
647
|
+
circumstances.
|
648
|
+
|
649
|
+
* **`SIGINT`** / **`SIGTERM`** -- ask the service to gracefully stop running.
|
650
|
+
It will call your service's `#shutdown` method to ask it to stop what it's
|
651
|
+
doing and exit. If the signal is sent a second time, the service will be
|
652
|
+
summarily terminated as soon as practical, without being given the
|
653
|
+
opportunity to gracefully release resources. As usual, if a service process
|
654
|
+
needs to be whacked completely and utterly *right now*, `SIGKILL` is what you
|
655
|
+
want to use.
|
656
|
+
|
657
|
+
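The severity stepping described for `SIGUSR1` and `SIGUSR2` can be pictured with Ruby's standard `Logger`, which numbers severities from `DEBUG` (0) up to `FATAL` (4). The helpers below are a hypothetical sketch of that stepping, assuming it stops at either end of the list above; they are not taken from `ServiceSkeleton` itself.

```ruby
require "logger"

# Hypothetical helpers sketching the SIGUSR1/SIGUSR2 stepping.
# A lower Logger level means more messages get through.

# One step towards DEBUG (what SIGUSR1 is described as doing).
def more_verbose(logger)
  logger.level -= 1 if logger.level > Logger::DEBUG
  logger.level
end

# One step towards FATAL (what SIGUSR2 is described as doing).
def less_verbose(logger)
  logger.level += 1 if logger.level < Logger::FATAL
  logger.level
end
```

From outside the process, the equivalent of calling `more_verbose` would be sending the signal, for example with `Process.kill("USR1", pid)` or `kill -USR1 <pid>`.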
### Hooking Signals

In addition to the above default signal dispositions, you can also hook signals
yourself for whatever purpose you desire. This is typically done in your
`#run` method, before entering the main service loop.

To hook a signal, just call `hook_signal` in your class definition, with a
signal specification and a block of code to execute when the signal fires. You
can even hook the same signal more than once, because the signal handlers that
`ServiceSkeleton` uses chain to other signal handlers. As an example, if you
want to print "oof!" every time the `SIGCONT` signal is received, you'd do
something like this:

    require "service_skeleton"

    class MyService
      include ServiceSkeleton

      # Print "oof!" every time the process receives SIGCONT.
      hook_signal("CONT") { puts "oof!" }

      def run
        # Do nothing, forever, while waiting for signals to arrive.
        loop { sleep }
      end
    end

The code in the block will be executed in the context of the service worker
instance that is running at the time the signal is received. You are
responsible for ensuring that whatever your handler does is concurrency-safe.

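As one way of meeting the concurrency-safety requirement, state shared between a hooked block and the main service loop can be guarded with a `Mutex`. This is a generic sketch; `ReloadCounter` is a hypothetical class, and it assumes (as the Signal Handling section suggests) that hooked blocks run as regular code, where mutexes are usable.

```ruby
# Hypothetical example: a counter that a hooked signal block bumps while
# the main service loop reads it. Every access goes through the mutex.
class ReloadCounter
  def initialize
    @mutex = Mutex.new
    @count = 0
  end

  # Intended to be called from the signal handler block.
  def bump
    @mutex.synchronize { @count += 1 }
  end

  # Intended to be called from the main service loop.
  def value
    @mutex.synchronize { @count }
  end
end
```

A hooked block would then only call `bump`, leaving any slow or complicated work to the main loop, which polls `value` at a convenient moment.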
When the service is shut down, all signal handlers will be automatically
unhooked, which saves you having to do it yourself.

## HTTP Admin Interface

In these modern times we live in, it seems everything from nuclear reactors to
toasters can be controlled from a browser. Why should your services be any
different?

### HTTP Admin Configuration

In the spirit of "secure by default", you must explicitly enable the HTTP admin
interface, and configure an authentication method. To do that, use the
following environment variables, where `<SERVICENAME>_` is the all-uppercase
version of [the service name](#the-service-name).

* **`<SERVICENAME>_HTTP_ADMIN_PORT`** (integer; range 1..65535; default: `""`)
  -- if set to a valid port number (`1` to `65535` inclusive), the HTTP admin
  interface will listen on that port, if also enabled by configuring
  authentication.

* **`<SERVICENAME>_HTTP_ADMIN_BASIC_AUTH`** (string; default: `""`) -- if set
  to a string containing a username and password separated by a colon, then
  authentication via [HTTP Basic auth](https://tools.ietf.org/html/rfc7617)
  will be supported. Note that in addition to this setting, an admin port must
  also be specified in order for the admin interface to be enabled.

* **`<SERVICENAME>_HTTP_ADMIN_PROXY_USERNAME_HEADER`** (string; default: `""`)
  -- if set to a non-empty string, then incoming requests will be examined for
  an HTTP header with the specified name. If such a header exists and has a
  non-empty value, then the request will be deemed to have been authenticated
  by an upstream authenticating proxy (such as
  [`discourse-auth-proxy`](https://github.com/discourse/discourse-auth-proxy))
  as the user given in the header value. Note that in addition to this
  setting, an admin port must also be specified in order for the admin
  interface to be enabled.

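The interaction between the three variables above (a port plus at least one authentication method) can be summarised in a short sketch. `http_admin_settings` is a hypothetical helper written for illustration, not the gem's own code; treating an out-of-range port as "not set" is an assumption.

```ruby
# Hypothetical helper, not ServiceSkeleton's code: summarises the rules
# above for an environment-like hash and an all-uppercase variable prefix.
def http_admin_settings(env, prefix)
  port_str = env.fetch("#{prefix}_HTTP_ADMIN_PORT", "")
  basic    = env.fetch("#{prefix}_HTTP_ADMIN_BASIC_AUTH", "")
  header   = env.fetch("#{prefix}_HTTP_ADMIN_PROXY_USERNAME_HEADER", "")

  port = port_str.empty? ? nil : Integer(port_str, 10)
  # Assumption: an out-of-range port is treated the same as no port.
  port = nil unless port && (1..65535).cover?(port)

  # Split on the first colon only, so the password may contain colons.
  user, password = basic.split(":", 2)
  basic_auth = user && password ? { user: user, password: password } : nil

  {
    # Enabled only with a port AND at least one authentication method.
    enabled: !port.nil? && (!basic_auth.nil? || !header.empty?),
    port: port,
    basic_auth: basic_auth,
    proxy_username_header: header.empty? ? nil : header,
  }
end
```

For instance, setting only `MY_SERVICE_HTTP_ADMIN_PORT=8080` (a placeholder name and value) would leave `enabled` false in this sketch, matching the "secure by default" rule.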
### HTTP Admin Usage

The HTTP admin interface provides both an interactive, browser-based mode
and a RESTful interface, which should, in general, provide equivalent
functionality.

* Visiting the service's `IP address:port` in a web browser will bring up an HTML
  interface showing all the features that are available. Usage should
  (hopefully) be self-explanatory.

* Visiting the service's `IP address:port` whilst accepting `application/json`
  responses will provide a directory of links to available endpoints which you
  can use to interact with the HTTP admin interface programmatically.

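A programmatic client for the JSON mode might look like the sketch below, which uses only Ruby's standard `net/http`. The address, port, and credentials are placeholders, both method names are hypothetical, and the shape of the returned directory is not specified here, so the body is returned unparsed.

```ruby
require "net/http"
require "uri"

# Build (but do not send) a request for the JSON endpoint directory.
# Credentials are only attached when both are given (HTTP Basic auth).
def admin_directory_request(base_url, user: nil, password: nil)
  request = Net::HTTP::Get.new(URI(base_url))
  request["Accept"] = "application/json"
  request.basic_auth(user, password) if user && password
  request
end

# Send the request and return the raw response body as a string.
def fetch_admin_directory(base_url, **credentials)
  uri = URI(base_url)
  Net::HTTP.start(uri.host, uri.port) do |http|
    http.request(admin_directory_request(base_url, **credentials)).body
  end
end
```

With a service listening on, say, `127.0.0.1:8080`, calling `fetch_admin_directory("http://127.0.0.1:8080/", user: "admin", password: "secret")` would request the directory of endpoints described above.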
# Contributing

Patches can be sent as [a GitHub pull
request](https://github.com/discourse/service_skeleton). This project is
intended to be a safe, welcoming space for collaboration, and contributors
are expected to adhere to the [Contributor Covenant code of
conduct](CODE_OF_CONDUCT.md).

# Licence

Unless otherwise stated, everything in this repo is covered by the following
copyright notice:

    Copyright (C) 2018, 2019 Civilized Discourse Construction Kit, Inc.
    Copyright (C) 2019, 2020 Matt Palmer

    This program is free software: you can redistribute it and/or modify it
    under the terms of the GNU General Public License version 3, as
    published by the Free Software Foundation.

    This program is distributed in the hope that it will be useful,
    but WITHOUT ANY WARRANTY; without even the implied warranty of
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
    GNU General Public License for more details.

    You should have received a copy of the GNU General Public License
    along with this program. If not, see <http://www.gnu.org/licenses/>.