logstash_writer 0.0.1

@@ -0,0 +1,197 @@
1
+ The `LogstashWriter` is an opinionated, reliable, and standards-observant
2
+ implementation of a means of getting events to a logstash cluster.
3
+
4
+
5
+ # Installation
6
+
7
+ It's a gem:
8
+
9
+ gem install logstash_writer
10
+
11
+ There's also the wonders of [the Gemfile](http://bundler.io):
12
+
13
+ gem 'logstash_writer'
14
+
15
+ If you're the sturdy type that likes to run from git:
16
+
17
+ rake install
18
+
19
+ Or, if you've eschewed the convenience of Rubygems entirely, then you
20
+ presumably know what to do already.
21
+
22
+
23
+ # Logstash Configuration
24
+
25
+ In order for logstash to receive the events being written, it must have a
26
+ `json_lines` TCP input configured. Something like this will do the trick:
27
+
28
+ input {
29
+ tcp {
30
+ id => "json_lines"
31
+ port => 5151
32
+ codec => "json_lines"
33
+ }
34
+ }
35
+
36
+ We'd really like to support the more featureful lumberjack (or, these days,
37
+ "beats") protocol, but [Elastic refuses to document it
38
+ properly](https://github.com/elastic/libbeat/issues/279), so until such time
39
+ as that is fixed, we are stuck with the `json_lines` approach.
40
+
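As a rough illustration of what the `json_lines` codec expects on the wire, each event is a single JSON object followed by a newline. The sketch below is an assumption-laden stand-in, not the gem's code: `encode_event` is a hypothetical helper that mimics how the writer fills in `@timestamp` and `_id` when they are missing.

```ruby
require 'json'

# Hypothetical helper (not part of the gem): serialise one event the way a
# `json_lines` input expects it, i.e. one JSON document per line.
def encode_event(event, now: Time.now)
  e = event.dup

  # Only add the fields if the caller hasn't supplied them already.
  unless e.key?(:@timestamp) || e.key?("@timestamp")
    e[:@timestamp] = now.utc.strftime("%FT%TZ")
  end

  unless e.key?(:_id) || e.key?("_id")
    # A long random string; no cryptographic guarantees are needed here.
    e[:_id] = rand(36**24).to_s(36)
  end

  e.to_json + "\n"
end

line = encode_event({ any: "hash", you: "like" })
puts line
```

Writing `line` to a TCP socket connected to the input above should be enough for logstash to pick the event up, assuming the port and codec match.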
41
+
42
+ # Usage
43
+
44
+ An instance of `LogstashWriter` needs to be given the location of a server
45
+ (or servers) to send the events to. This can be any of:
46
+
47
+ # An IPv4 address and port
48
+ lw = LogstashWriter.new(server_name: "192.0.2.42:5151")
49
+
50
+ # An IPv6 address and port
51
+ lw = LogstashWriter.new(server_name: "[2001:db8::42]:5151")
52
+ # ... or without the brackets, if you like to live dangerously:
53
+ lw = LogstashWriter.new(server_name: "2001:db8::42:5151")
54
+
55
+ # A hostname that resolves to one or more A/AAAA addresses, and port
56
+ lw = LogstashWriter.new(server_name: "logstash:5151")
57
+
58
+ # A DNS name that resolves to one or more SRV records (which
59
+ # specify the port as part of the record)
60
+ lw = LogstashWriter.new(server_name: "_logstash._tcp")
61
+
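To make the distinction between these forms concrete, here is a minimal sketch of how a `server_name` string could be classified. `classify_server_name` is a hypothetical helper written for illustration; it follows the same general idea as the gem (try the left-hand side as an address literal, fall back to a hostname, and treat a name without a port as SRV), but it is not the gem's API.

```ruby
require 'ipaddr'

# Hypothetical helper: decide which of the supported forms a server_name
# string falls into. Returns [kind, host, port].
def classify_server_name(server_name)
  if server_name =~ /\A(.*):(\d+)\z/
    host, port = $1, $2.to_i
    begin
      # IPAddr raises an ArgumentError subclass for anything that isn't
      # an address literal, and does not perform DNS lookups.
      IPAddr.new(host)
      [:address_literal, host, port]
    rescue ArgumentError
      [:hostname, host, port]
    end
  else
    # No trailing ":<port>", so assume the port comes from SRV records.
    [:srv, server_name, nil]
  end
end

%w{192.0.2.42:5151 [2001:db8::42]:5151 logstash:5151 _logstash._tcp}.each do |name|
  p classify_server_name(name)
end
```

Note that the greedy match means an unbracketed IPv6 literal is split at its last colon, which is why leaving the port off an IPv6 address leads to confusion.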
62
+ Once you have your `LogstashWriter` instance, you can start firing
63
+ events:
64
+
65
+ lw.send_event(any: "hash", you: "like")
66
+
67
+ However, they won't actually be sent to the logstash server until you start
68
+ the background worker thread:
69
+
70
+ lw.run
71
+
72
+ When it comes time to shut down, you can do so gracefully, like this:
73
+
74
+ lw.stop
75
+
76
+ This will wait for all events in the queue to drain to the logstash server
77
+ before returning.
78
+
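The queue-then-drain lifecycle described above can be sketched in isolation. `DrainingWorker` below is a hypothetical, simplified stand-in (no sockets, no retries) that only illustrates the pattern: events may be queued before the worker starts, and `stop` returns only after everything queued has been handled.

```ruby
# Hypothetical sketch of a background worker that drains its queue on stop.
class DrainingWorker
  def initialize(&handler)
    @handler   = handler
    @queue     = []
    @mutex     = Mutex.new
    @cv        = ConditionVariable.new
    @terminate = false
    @thread    = nil
  end

  # Queue an item; nothing is processed until #run has been called.
  def push(item)
    @mutex.synchronize { @queue << item; @cv.signal }
    nil
  end

  # Start the worker thread (a no-op if it is already running).
  def run
    @thread ||= Thread.new { work_loop }
    nil
  end

  # Ask the worker to finish, and wait until the queue has drained.
  def stop
    return nil unless @thread
    @mutex.synchronize { @terminate = true; @cv.signal }
    @thread.join
    @thread = nil
    @terminate = false
    nil
  end

  private

  def work_loop
    loop do
      item = nil
      done = false
      @mutex.synchronize do
        @cv.wait(@mutex) while @queue.empty? && !@terminate
        if @queue.empty?
          done = true          # terminating, and nothing left to do
        else
          item = @queue.shift
        end
      end
      break if done
      @handler.call(item)
    end
  end
end

seen = []
worker = DrainingWorker.new { |item| seen << item }
worker.push(1)
worker.push(2)
worker.run
worker.push(3)
worker.stop
p seen
```

The worker only exits when the queue is empty *and* termination has been requested, which is what gives `stop` its "wait for the drain" behaviour.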
79
+ In the event that a logstash server is unavailable at the time your events
80
+ are sent, events will be queued until a server is contactable. However,
81
+ because memory is a finite resource, the backlog is limited to 1,000 events
82
+ by default. If you want a larger (or smaller) limit, tell the writer when
83
+ you create it:
84
+
85
+ lw = LogstashWriter.new(server_name: "...", backlog: 1_000_000)
86
+
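The backlog behaves like a bounded queue that discards its oldest entries once the limit is exceeded. The following is a minimal sketch of that policy; `BoundedBacklog` is a hypothetical class written for illustration, not something the gem exposes.

```ruby
# Hypothetical sketch of a drop-oldest bounded queue.
class BoundedBacklog
  attr_reader :dropped

  def initialize(limit)
    @limit   = limit
    @items   = []
    @dropped = 0
    @mutex   = Mutex.new
  end

  # Append an item, discarding the oldest entries if we're over the limit.
  def push(item)
    @mutex.synchronize do
      @items << item
      while @items.length > @limit
        @items.shift
        @dropped += 1
      end
    end
    nil
  end

  # A snapshot of the current contents, oldest first.
  def to_a
    @mutex.synchronize { @items.dup }
  end
end

backlog = BoundedBacklog.new(3)
(1..5).each { |n| backlog.push(n) }
p backlog.to_a     # the three most recent items
p backlog.dropped  # how many were discarded
```

Dropping the oldest events keeps the most recent history intact during an outage, at the cost of losing whatever happened first.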
87
+ If you want to know what your writer is doing, give it a logger:
88
+
89
+ lw = LogstashWriter.new(server_name: "...", logger: Logger.new("/dev/stderr"))
90
+
91
+
92
+ ## Prometheus Metrics
93
+
94
+ If you're instrumentally inclined, you can get Prometheus metrics
95
+ out of the writer by passing a client registry (which you'll presumably know
96
+ what to do with if you're into that sort of thing):
97
+
98
+ reg = Prometheus::Client::Registry.new
99
+ lw = LogstashWriter.new(server_name: "...", metrics_registry: reg)
100
+
101
+ The metrics that are exposed are:
102
+
103
+ * **`logstash_writer_events_received_total`** -- the number of events that
104
+ have been submitted for writing by calling `#send_event`.
105
+
106
+ * **`logstash_writer_events_written_total`** -- the number of events that
107
+ have been submitted to the logstash server, labelled by `server` (the
108
+ `address:port` pair for the server that each event was submitted to).
109
+
110
+ * **`logstash_writer_events_dropped_total`** -- the number of events
111
+ that were dropped due to the backlog buffer filling up. An increase
112
+ in this value over time indicates that your logstash servers are either
113
+ unreliable, or unable to cope with peak event ingestion loads.
114
+
115
+ * **`logstash_writer_queue_size`** -- the number of events currently in
116
+ the backlog queue awaiting transmission. In *theory*, this value should
117
+ always be `received - (sent + dropped)`, but this gauge is maintained
118
+ separately as a cross-check in case of bugs.
119
+
120
+ * **`logstash_writer_last_sent_event_timestamp`** -- the UTC timestamp,
121
+ represented as the number of (fractional) seconds since the Unix epoch, at
122
+ which the most recent event sent to a logstash server was originally
123
+ submitted via `#send_event`. This might require some unpacking.
124
+
125
+ If everything is going along swimmingly, there are no queued events, and
126
+ events submitted are immediately forwarded to logstash, this gauge will
127
+ be whenever the last event was sent. No big problem. However, in the
128
+ event of problems, this timestamp can tell you several things.
129
+
130
+ Firstly, if there are queued events, you can tell how far behind in real
131
+ time your logstash event history is, by calculating `NOW() -
132
+ logstash_writer_last_sent_event_timestamp`. Thus, if you're not finding
133
+ events in your Kibana dashboard you were expecting to see, you can tell
134
+ that there's a clog in the pipes by looking at this.
135
+
136
+ Alternately, if the queue is empty, but this timestamp is perhaps older
137
+ than you'd expect, then you know the problem is "upstream" of
138
+ `LogstashWriter`. If your code isn't calling `#send_event`, then this
139
+ timestamp won't be progressing, and you can go look for a deadlock or
140
+ something in your code, and don't need to check if logstash is misbehaving
141
+ (again).
142
+
143
+ * **`logstash_writer_connected_to_server`** -- this flag timeseries (can be
144
+ either `1` or `0`) is simply a way for you to quickly determine whether
145
+ the writer has a server to talk to, if it wants one. That is, this time
146
+ series will only be `0` if there's an event to write but no logstash
147
+ server can be found to write it to.
148
+
149
+ * **`logstash_writer_connect_exceptions_total`** -- a count of exceptions
150
+ raised whilst attempting to connect to a logstash server, labelled by the
151
+ exception class and the server to which the connection was attempted.
152
+
153
+ * **`logstash_writer_write_exceptions_total`** -- a count of exceptions
154
+ raised whilst attempting to write data to a connected logstash server,
155
+ labelled by the exception class and the server to which the write was
156
+ directed.
157
+
158
+ * **`logstash_writer_write_loop_exceptions_total`** -- a count of exceptions
159
+ raised in the "write loop", which is the main infinite loop executed by
160
+ the background worker thread. Exceptions which occur here are...
161
+ concerning, because whilst exceptions are expected while connecting and
162
+ writing to logstash servers, the write loop *itself* shouldn't normally
163
+ be flinging exceptions around.
164
+
165
+ * **`logstash_writer_write_loop_ok`** -- a flag (can be either `1` or `0`)
166
+ indicating whether the write loop is healthy (`1`) or failing (`0`). This is, essentially,
167
+ the `up` series for the logstash writer; if this is `0`, nothing useful is
168
+ happening in the logstash writer.
169
+
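The reasoning about the queue size and last-sent timestamp gauges can be expressed as a small decision function. This is only a sketch under stated assumptions: `diagnose` and its `stale_after` threshold are hypothetical, and the inputs are assumed to be values you have already read from your metrics system.

```ruby
# Hypothetical helper: interpret the queue_size and
# last_sent_event_timestamp gauges together.
#
# stale_after is an assumed threshold (in seconds) for "older than you'd
# expect"; pick whatever suits your event rate.
def diagnose(queue_size:, last_sent_event_timestamp:, now: Time.now.to_f, stale_after: 60.0)
  age = now - last_sent_event_timestamp

  if queue_size > 0
    # Events are waiting; `age` is how far behind logstash is.
    [:backlogged, age]
  elsif age > stale_after
    # Nothing queued, but nothing sent recently either: look upstream.
    [:upstream_quiet, age]
  else
    [:healthy, age]
  end
end

# The cross-check described for the queue size gauge.
def expected_queue_size(received:, written:, dropped:)
  received - (written + dropped)
end

p diagnose(queue_size: 5, last_sent_event_timestamp: 100.0, now: 130.0)
p expected_queue_size(received: 1_000, written: 900, dropped: 40)
```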
170
+
171
+ # Contributing
172
+
173
+ Patches can be sent as [a GitHub pull
174
+ request](https://github.com/discourse/logstash-writer). This project is
175
+ intended to be a safe, welcoming space for collaboration, and contributors
176
+ are expected to adhere to the [Contributor Covenant code of
177
+ conduct](CODE_OF_CONDUCT.md).
178
+
179
+
180
+ # Licence
181
+
182
+ Unless otherwise stated, everything in this repo is covered by the following
183
+ copyright notice:
184
+
185
+ Copyright (C) 2015 Civilized Discourse Construction Kit, Inc.
186
+
187
+ This program is free software: you can redistribute it and/or modify it
188
+ under the terms of the GNU General Public License version 3, as
189
+ published by the Free Software Foundation.
190
+
191
+ This program is distributed in the hope that it will be useful,
192
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
193
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
194
+ GNU General Public License for more details.
195
+
196
+ You should have received a copy of the GNU General Public License
197
+ along with this program. If not, see <http://www.gnu.org/licenses/>.
@@ -0,0 +1,498 @@
1
+ require 'ipaddr'
2
+ require 'json'
3
+ require 'resolv'
4
+ require 'socket'
5
+ require 'prometheus/client'
6
+ require 'logger'
7
+ # Write messages to a logstash server.
8
+ #
9
+ # Flings events, represented as JSON objects, to logstash using the
10
+ # `json_lines` codec (over TCP). Doesn't do any munging or modification of
11
+ # the event data given to it, other than adding `@timestamp` and `_id`
12
+ # fields if they do not already exist.
13
+ #
14
+ # We support highly-available logstash installations by means of multiple
15
+ # address records, or via SRV records. See the docs for .new for details
16
+ # as to the valid formats for the server.
17
+ #
18
+ class LogstashWriter
19
+ # How long, in seconds, to pause the first time an error is encountered.
20
+ # Each successive error will cause a longer wait, so as to prevent
21
+ # thundering herds.
22
+ INITIAL_RETRY_WAIT = 0.5
23
+
24
+ # Create a new logstash writer.
25
+ #
26
+ # Once the object is created, you're ready to give it messages by
27
+ # calling #send_event. No messages will actually be *delivered* to
28
+ # logstash, though, until you call #run.
29
+ #
30
+ # If multiple addresses are returned from an A/AAAA resolution, or
31
+ # multiple SRV records, then the records will all be tried in random
32
+ # order (for A/AAAA records) or in line with the standard rules for
33
+ # weight and priority (for SRV records).
34
+ #
35
+ # @param server_name [String] details for connecting to the logstash
36
+ # server(s). This can be:
37
+ #
38
+ # * `<IPv4 address>:<port>` -- a literal IPv4 address, and mandatory
39
+ # port.
40
+ #
41
+ # * `[<IPv6 address>]:<port>` -- a literal IPv6 address, and mandatory
42
+ # port. Enclosing the address in square brackets isn't required, but
43
+ # it's a serving suggestion to make it a little easier to discern
44
+ # address from port. Forgetting to include the port will end in
45
+ # confusion.
46
+ #
47
+ # * `<hostname>:<port>` -- the given hostname will be resolved for
48
+ # A/AAAA records, and all returned addresses will be tried in random
49
+ # order until one is found that accepts a connection.
50
+ #
51
+ # * `<dnsname>` -- the given dnsname will be resolved for SRV records,
52
+ # and the returned target hostnames and ports will be tried in the
53
+ # RFC2782-approved manner according to priority and weight until one
54
+ # is found which accepts a connection.
55
+ #
56
+ # @param logger [Logger] something to which we can write log entries
57
+ # for debugging and error-reporting purposes.
58
+ #
59
+ # @param backlog [Integer] a non-negative integer specifying the maximum
60
+ # number of events that should be queued during periods when the
61
+ # logstash server is unavailable. If the limit is exceeded, the oldest
62
+ # (= first event to be queued) will be dropped.
63
+ #
64
+ # @param metrics_registry [Prometheus::Client::Registry] where to register
65
+ # the metrics instrumenting the operation of the writer instance.
66
+ #
67
+ # @param metrics_prefix [#to_s] what to prefix all of the metrics used to
68
+ # instrument the operation of the writer instance. If you instantiate
69
+ # multiple LogstashWriter instances with the same `metrics_registry`, this
70
+ # parameter *must* be different for each of them, or you will get some
71
+ # inscrutable exception raised from the registry.
72
+ #
73
+ def initialize(server_name:, logger: Logger.new("/dev/null"), backlog: 1_000, metrics_registry: Prometheus::Client::Registry.new, metrics_prefix: :logstash_writer)
74
+ @server_name, @logger, @backlog = server_name, logger, backlog
75
+
76
+ @metrics = {
77
+ received: metrics_registry.counter(:"#{metrics_prefix}_events_received_total", "The number of logstash events which have been submitted for delivery"),
78
+ sent: metrics_registry.counter(:"#{metrics_prefix}_events_written_total", "The number of logstash events which have been delivered to the logstash server"),
79
+ queue_size: metrics_registry.gauge(:"#{metrics_prefix}_queue_size", "The number of events currently in the queue to be sent"),
80
+ dropped: metrics_registry.counter(:"#{metrics_prefix}_events_dropped_total", "The number of events which have been dropped from the queue"),
81
+
82
+ lag: metrics_registry.gauge(:"#{metrics_prefix}_last_sent_event_timestamp", "When the last event successfully sent to logstash was originally received"),
83
+
84
+ connected: metrics_registry.gauge(:"#{metrics_prefix}_connected_to_server", "Boolean flag indicating whether we are currently connected to a logstash server"),
85
+ connect_exception: metrics_registry.counter(:"#{metrics_prefix}_connect_exceptions_total", "The number of exceptions that have occurred whilst attempting to connect to a logstash server"),
86
+ write_exception: metrics_registry.counter(:"#{metrics_prefix}_write_exceptions_total", "The number of exceptions that have occurred whilst attempting to write an event to a logstash server"),
87
+
88
+ write_loop_exception: metrics_registry.counter(:"#{metrics_prefix}_write_loop_exceptions_total", "The number of exceptions that have occurred in the writing loop"),
89
+ write_loop_ok: metrics_registry.gauge(:"#{metrics_prefix}_write_loop_ok", "Boolean flag indicating whether the writing loop is currently operating correctly, or is in a post-apocalyptic hellscape of never-ending exceptions"),
90
+ }
91
+
92
+ @metrics[:lag].set({}, 0)
93
+ @metrics[:queue_size].set({}, 0)
94
+
95
+ @queue = []
96
+ @queue_mutex = Mutex.new
97
+ @queue_cv = ConditionVariable.new
98
+
99
+ @socket_mutex = Mutex.new
100
+ @worker_mutex = Mutex.new
101
+ end
102
+
103
+ # Add an event to the queue, to be sent to logstash. Actual event
104
+ # delivery will happen in a worker thread that is started with
105
+ # #run. If the event does not have a `@timestamp` or `_id` element, they
106
+ # will be added with appropriate values.
107
+ #
108
+ # @param e [Hash] the event data to be sent.
109
+ #
110
+ # @return [NilClass]
111
+ #
112
+ def send_event(e)
113
+ unless e.is_a?(Hash)
114
+ raise ArgumentError, "Event must be a hash"
115
+ end
116
+
117
+ unless e.has_key?(:@timestamp) || e.has_key?("@timestamp")
118
+ e[:@timestamp] = Time.now.utc.strftime("%FT%TZ")
119
+ end
120
+
121
+ unless e.has_key?(:_id) || e.has_key?("_id")
122
+ # This is the quickest way I've found to get a long, random string.
123
+ # We don't need any sort of cryptographic or unpredictability
124
+ # guarantees for what we're doing here, so SecureRandom is unnecessary
125
+ # overhead.
126
+ e[:_id] = rand(0x1000_0000_0000_0000_0000_0000_0000_0000).to_s(36)
127
+ end
128
+
129
+ @queue_mutex.synchronize do
130
+ @queue << { content: e, arrival_timestamp: Time.now }
131
+ while @queue.length > @backlog
132
+ @queue.shift
133
+ stat_dropped
134
+ end
135
+ @queue_cv.signal
136
+
137
+ stat_received
138
+ end
139
+
140
+ nil
141
+ end
142
+
143
+ # Start sending events.
144
+ #
145
+ # This method will return almost immediately, and actual event
146
+ # transmission will commence in a separate thread.
147
+ #
148
+ # @return [NilClass]
149
+ #
150
+ def run
151
+ @worker_mutex.synchronize do
152
+ if @worker_thread.nil?
153
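+ m, cv, started = Mutex.new, ConditionVariable.new, false
154
+
155
+ @worker_thread = Thread.new { m.synchronize { started = true; cv.signal }; write_loop }
156
+
157
+ # Don't return until the thread has *actually* started (the flag guards against a missed signal)
158
+ m.synchronize { cv.wait(m) until started }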
159
+ end
160
+ end
161
+
162
+ nil
163
+ end
164
+
165
+ # Stop the worker thread.
166
+ #
167
+ # Politely ask the worker thread to please finish up once it's
168
+ # finished sending all messages that have been queued. This will
169
+ # return once the worker thread has finished.
170
+ #
171
+ # @return [NilClass]
172
+ #
173
+ def stop
174
+ @worker_mutex.synchronize do
175
+ if @worker_thread
176
+ @terminate = true
177
+ @queue_cv.signal
178
+ begin
179
+ @worker_thread.join
180
+ rescue Exception => ex
181
+ @logger.error("LogstashWriter") { (["Worker thread terminated with exception: #{ex.message} (#{ex.class})"] + ex.backtrace).join("\n ") }
182
+ end
183
+ @worker_thread = nil
184
+ @socket_mutex.synchronize { (@current_socket.close; @current_socket = nil) if @current_socket }
185
+ end
186
+ end
187
+
188
+ nil
189
+ end
190
+
191
+ # Disconnect from the currently-active server.
192
+ #
193
+ # In certain circumstances, you may wish to force the writer to stop
194
+ # sending messages to the currently-connected logstash server, and
195
+ # re-resolve the `server_name` to get a new address to talk to.
196
+ # Calling this method will cause that to happen.
197
+ #
198
+ # @return [NilClass]
199
+ #
200
+ def force_disconnect!
201
+ @socket_mutex.synchronize do
202
+ return if @current_socket.nil?
203
+
204
+ @logger.info("LogstashWriter") { "Forced disconnect from #{describe_peer(@current_socket) }" }
205
+ @current_socket.close if @current_socket
206
+ @current_socket = nil
207
+ end
208
+
209
+ nil
210
+ end
211
+
212
+ private
213
+
214
+ # The main "worker" method for getting events out of the queue and
215
+ # firing them at logstash.
216
+ #
217
+ def write_loop
218
+ error_wait = INITIAL_RETRY_WAIT
219
+
220
+ catch :terminate do
221
+ loop do
222
+ event = nil
223
+
224
+ begin
225
+ @queue_mutex.synchronize do
226
+ while @queue.empty? && !@terminate
227
+ @queue_cv.wait(@queue_mutex)
228
+ end
229
+
230
+ if @queue.empty? && @terminate
231
+ @terminate = false
232
+ throw :terminate
233
+ end
234
+
235
+ event = @queue.shift
236
+ end
237
+
238
+ current_socket do |s|
239
+ s.puts event[:content].to_json
240
+ stat_sent(describe_peer(s), event[:arrival_timestamp])
241
+ @metrics[:write_loop_ok].set({}, 1)
242
+ error_wait = INITIAL_RETRY_WAIT
243
+ end
244
+ rescue StandardError => ex
245
+ @logger.error("LogstashWriter") { (["Exception in write_loop: #{ex.message} (#{ex.class})"] + ex.backtrace).join("\n ") }
246
+ @queue_mutex.synchronize { @queue.unshift(event) if event }
247
+ @metrics[:write_loop_exception].increment(class: ex.class.to_s)
248
+ @metrics[:write_loop_ok].set({}, 0)
249
+ sleep error_wait
250
+ # Increase the error wait timeout for next time, up to a maximum
251
+ # interval of about 60 seconds
252
+ error_wait *= 1.1
253
+ error_wait = 60 if error_wait > 60
254
+ error_wait += rand / 0.5
255
+ end
256
+ end
257
+ end
258
+ end
259
+
260
+ # Yield a TCPSocket connected to the server we currently believe to be
261
+ # accepting log entries, so that something can send log entries to it.
262
+ #
263
+ # The yielding allows us to centralise all error detection and handling
264
+ # within this one method, and retry sending just by calling `yield` again
265
+ # when we've connected to another server.
266
+ #
267
+ def current_socket
268
+ # This could all be handled more cleanly with recursion, but I don't
269
+ # want to fill the stack if we have to retry a lot of times. Also
270
+ # can't just use `retry` because not all of the "go around again"
271
+ # conditions are due to exceptions.
272
+ done = false
273
+
274
+ until done
275
+ @socket_mutex.synchronize do
276
+ if @current_socket
277
+ begin
278
+ @logger.debug("LogstashWriter") { "Using current server #{describe_peer(@current_socket)}" }
279
+ yield @current_socket
280
+ @metrics[:connected].set({}, 1)
281
+ done = true
282
+ rescue SystemCallError => ex
283
+ # Something went wrong during the send; disconnect from this
284
+ # server and recycle
285
+ @metrics[:write_exception].increment(server: describe_peer(@current_socket), class: ex.class.to_s)
286
+ @logger.info("LogstashWriter") { "Error while writing to current server: #{ex.message} (#{ex.class})" }
287
+ @current_socket.close
288
+ @current_socket = nil
289
+ @metrics[:connected].set({}, 0)
290
+
291
+ sleep INITIAL_RETRY_WAIT
292
+ end
293
+ else
294
+ retry_delay = INITIAL_RETRY_WAIT * 10
295
+ candidates = resolve_server_name
296
+ @logger.debug("LogstashWriter") { "Server candidates: #{candidates.inspect}" }
297
+
298
+ if candidates.empty?
299
+ # A useful error message will (should?) have been logged by something
300
+ # down in the bowels of resolve_server_name, so all we have to do
301
+ # is wait a little while, then let the loop retry.
302
+ sleep INITIAL_RETRY_WAIT * 10
303
+ else
304
+ begin
305
+ next_server = candidates.shift
306
+
307
+ if next_server
308
+ @logger.debug("LogstashWriter") { "Trying to connect to #{next_server.to_s}" }
309
+ @current_socket = next_server.socket
310
+ else
311
+ @logger.debug("LogstashWriter") { "Could not connect to any server; pausing before trying again" }
312
+ @current_socket = nil
313
+ sleep retry_delay
314
+
315
+ # Calculate a longer retry delay next time we fail to connect
316
+ # to every server in the list, up to a maximum of (roughly) 60
317
+ # seconds.
318
+ retry_delay *= 1.5
319
+ retry_delay = 60 if retry_delay > 60
320
+ # A bit of randomness to prevent the thundering herd never goes
321
+ # amiss
322
+ retry_delay += rand
323
+ end
324
+ rescue SystemCallError => ex
325
+ # Connection failed for any number of reasons; try the next one in the list
326
+ @metrics[:connect_exception].increment(server: next_server.to_s, class: ex.class.to_s)
327
+ @logger.error("LogstashWriter") { "Failed to connect to #{next_server.to_s}: #{ex.message} (#{ex.class})" }
328
+ sleep INITIAL_RETRY_WAIT
329
+ retry
330
+ end
331
+ end
332
+ end
333
+ end
334
+ end
335
+ end
336
+
337
+ # Generate a human-readable description of the remote end of the given
338
+ # socket.
339
+ #
340
+ def describe_peer(s)
341
+ pa = s.peeraddr
342
+ if pa[0] == "AF_INET6"
343
+ "[#{pa[3]}]:#{pa[1]}"
344
+ else
345
+ "#{pa[3]}:#{pa[1]}"
346
+ end
347
+ end
348
+
349
+ # Turn the server_name given in the constructor into a list of Target
350
+ # objects, suitable for iterating through to find someone to talk to.
351
+ #
352
+ def resolve_server_name
353
+ return [static_target] if static_target
354
+
355
+ # The IPv6 literal case should have been taken care of by
356
+ # static_target, so the only two cases we have to deal with
357
+ # here are specified-port (assume A/AAAA) or no port (assume SRV).
358
+ if @server_name =~ /:/
359
+ host, port = @server_name.split(":", 2)
360
+ targets_from_address_record(host, port)
361
+ else
362
+ targets_from_srv_record(@server_name)
363
+ end
364
+ end
365
+
366
+ # Figure out whether the server spec we were given looks like an address:port
367
+ # combo (in which case return a memoised target), else return `nil` to let
368
+ # the DNS take over.
369
+ def static_target
370
+ # It is valid to memoize this because address literals don't change
371
+ # their resolution over time.
372
+ @static_target ||= begin
373
+ if @server_name =~ /\A(.*):(\d+)\z/
374
+ begin
375
+ IPAddr.new($1)
376
+ rescue ArgumentError
377
+ # Whatever is on the LHS isn't a recognisable address literal;
378
+ # assume hostname
379
+ nil
380
+ else
381
+ Target.new($1, $2.to_i)
382
+ end
383
+ end
384
+ end
385
+ end
386
+
387
+ # Resolve hostname as A/AAAA, and generate randomly-sorted list of Target
388
+ # records from the list of addresses resolved.
389
+ #
390
+ def targets_from_address_record(hostname, port)
391
+ addrs = Resolv::DNS.new.getaddresses(hostname)
392
+ if addrs.empty?
393
+ @logger.warn("LogstashWriter") { "No addresses resolved for server_name #{hostname.inspect}" }
394
+ end
395
+ addrs.sort_by { rand }.map { |a| Target.new(a.to_s, port.to_i) }
396
+ end
397
+
398
+ # Resolve the given hostname as a SRV record, and generate a list of
399
+ # Target records from the resources returned. The list will be arranged
400
+ # in line with the RFC2782-specified algorithm, respecting the weight and
401
+ # priority of the records.
402
+ #
403
+ def targets_from_srv_record(hostname)
404
+ [].tap do |list|
405
+ left = Resolv::DNS.new.getresources(@server_name, Resolv::DNS::Resource::IN::SRV)
406
+ if left.empty?
407
+ @logger.warn("LogstashWriter") { "No SRV records found for server_name #{@server_name.inspect}" }
408
+ end
409
+
410
+ # Let the soft-SRV shuffle... BEGIN!
411
+ until left.empty?
412
+ prio = left.map { |rr| rr.priority }.uniq.min
413
+ candidates = left.select { |rr| rr.priority == prio }
414
+ left -= candidates
415
+ candidates.sort_by! { |rr| [rr.weight, rr.target.to_s] }
416
+ until candidates.empty?
417
+ selector = rand(candidates.inject(1) { |n, rr| n + rr.weight })
418
+ chosen = candidates.inject(0) do |n, rr|
419
+ break rr if n + rr.weight >= selector
420
+ n + rr.weight
421
+ end
422
+ candidates.delete(chosen)
423
+ list << Target.new(chosen.target.to_s, chosen.port)
424
+ end
425
+ end
426
+ end
427
+ end
428
+
429
+ def stat_received
430
+ @metrics[:received].increment({})
431
+ @metrics[:queue_size].increment({})
432
+ end
433
+
434
+ def stat_sent(peer, arrived_time)
435
+ @metrics[:sent].increment(server: peer)
436
+ @metrics[:queue_size].decrement({})
437
+ @metrics[:lag].set({}, arrived_time.to_f)
438
+ end
439
+
440
+ def stat_dropped
441
+ @metrics[:queue_size].decrement({})
442
+ @metrics[:dropped].increment({})
443
+ end
444
+
445
+ # An individual target for logstash messages
446
+ #
447
+ # Takes a host and port, gives back a socket to send data down.
448
+ #
449
+ class Target
450
+ # Create a new target.
451
+ #
452
+ # @param addr [String] an IP address or hostname to which to connect.
453
+ #
454
+ # @param port [Integer] the TCP port number, in the range 1-65535.
455
+ #
456
+ # @raise [ArgumentError] if `addr` is not a valid-looking IP address or
457
+ # hostname, or if the port number is not in the valid range.
458
+ #
459
+ def initialize(addr, port)
460
+ #:nocov:
461
+ unless addr.is_a? String
462
+ raise ArgumentError, "addr #{addr.inspect} is not a string"
463
+ end
464
+
465
+ unless port.is_a? Integer
466
+ raise ArgumentError, "port #{port.inspect} is not an integer"
467
+ end
468
+
469
+ unless (1..65535).include?(port)
470
+ raise ArgumentError, "invalid port number #{port.inspect} (must be in range 1-65535)"
471
+ end
472
+ #:nocov:
473
+
474
+ @addr, @port = addr, port
475
+ end
476
+
477
+ # Create a connection.
478
+ #
479
+ # @return [IO] a socket to the target.
480
+ #
481
+ # @raise [SystemCallError] if connection cannot be established
482
+ # for any reason.
483
+ #
484
+ def socket
485
+ TCPSocket.new(@addr, @port)
486
+ end
487
+
488
+ # Simple string representation of the target.
489
+ #
490
+ # @return [String]
491
+ #
492
+ def to_s
493
+ "#{@addr}:#{@port}"
494
+ end
495
+ end
496
+
497
+ private_constant :Target
498
+ end
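
The priority and weight handling performed by `targets_from_srv_record` can be easier to follow as a standalone sketch. The code below is not the gem's implementation; it is a simplified rendering of the RFC 2782 selection procedure, using a hypothetical `SrvRecord` struct in place of `Resolv::DNS::Resource::IN::SRV` and an injectable random number generator so the behaviour can be tested.

```ruby
# Hypothetical stand-in for an SRV resource record.
SrvRecord = Struct.new(:priority, :weight, :target, :port)

# Order records by ascending priority; within a priority, pick records one
# at a time with probability proportional to their weight.
def order_srv_records(records, rng: Random.new)
  records.group_by(&:priority).sort.flat_map do |_priority, group|
    # Zero-weight records sort first, as RFC 2782 suggests.
    remaining = group.sort_by { |rr| [rr.weight, rr.target] }
    ordered = []

    until remaining.empty?
      total    = remaining.sum(&:weight)
      selector = rng.rand(0..total)
      running  = 0

      # First record whose running weight sum reaches the selector.
      idx = remaining.index { |rr| (running += rr.weight) >= selector }
      ordered << remaining.delete_at(idx)
    end

    ordered
  end
end

records = [
  SrvRecord.new(20, 0,  "c.example.com", 5151),
  SrvRecord.new(10, 60, "a.example.com", 5151),
  SrvRecord.new(10, 40, "b.example.com", 5151),
]

order_srv_records(records).each { |rr| puts "#{rr.target}:#{rr.port}" }
```

Lower-priority-number records are always tried first; weights only influence the order within a single priority level.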