fluent-plugin-kubernetes_metadata_filter 0.21.0 → 0.22.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA1:
-  metadata.gz: b20a61e09187754c82416670f602ee2588399078
-  data.tar.gz: ff91664e2431bf6d419b9a9a4ef04367ce0d8dae
+  metadata.gz: 300f1d004ac647a2b8ecbbd255b25ccc37865794
+  data.tar.gz: 871d3ccd62bc3d9129fd3b9b78858022aa576d40
 SHA512:
-  metadata.gz: 10d00af65128a6b779fb8aab70a39e0ac20d8cf6b3030514eec564d66e3297e04ea43689cf7296e55ab2174332c0d9aac11425bcc506459798c0820cc45ec4fa
-  data.tar.gz: 2bbd2911b66686211fefcf31f62a076b24b37b3784cf4998f1d5f7e5c9238adc2a1752dd8d9a3645762fbf912f787da3941d84b30d8056a9220ce04d24ed35b6
+  metadata.gz: a64f8fc1a0511df3df1afe86c33150b99f3636e95d91e2f385d49fdea82ec7e2d446cc2fe4f1e7cd2d26eb7d47466214c79d3ad6b6f874f0b5ca85d875972e1e
+  data.tar.gz: d6a913cd9c1a0ee7def06ef7ef67b008a3a5c3086e92a65d89b5f3fe9e36e26d4507b17484cc3cf421b7d26d56324fd34a7ee6a9d1aaef1b7ca4d0074d5d82d7
data/README.md CHANGED
@@ -27,7 +27,10 @@ This must used named capture groups for `container_name`, `pod_name` & `namespac
 * `preserve_json_log` - preserve JSON logs in raw form in the `log` key, only used if the previous option is true (default: `true`)
 * `de_dot` - replace dots in labels with configured `de_dot_separator`, required for ElasticSearch 2.x compatibility (default: `true`)
 * `de_dot_separator` - separator to use if `de_dot` is enabled (default: `_`)
+* `use_journal` - If false (default), messages are expected to be formatted and tagged as if read by the fluentd in\_tail plugin with wildcard filename. If true, messages are expected to be formatted as if read from the systemd journal. The `MESSAGE` field has the full message. The `CONTAINER_NAME` field has the encoded k8s metadata (see below). The `CONTAINER_ID_FULL` field has the full container uuid. This requires docker to use the `--log-driver=journald` log driver.
+* `container_name_to_kubernetes_regexp` - The regular expression used to extract the k8s metadata encoded in the journal `CONTAINER_NAME` field (default: `'^k8s_(?<container_name>[^\.]+)\.(?<container_hash>[a-z0-9]{8})_(?<pod_name>[^_]+)_(?<namespace>[^_]+)_(?<pod_id>[^_]+)_(?<pod_randhex>[a-z0-9]{8})$'`)
 
+Reading from the JSON formatted log files with `in_tail` and wildcard filenames:
 ```
 <source>
   type tail
@@ -48,11 +51,39 @@ This must used named capture groups for `container_name`, `pod_name` & `namespac
 </match>
 ```
 
+Reading from the systemd journal (requires the fluentd `fluent-plugin-systemd` and `systemd-journal` plugins, and requires docker to use the `--log-driver=journald` log driver):
+```
+<source>
+  type systemd
+  path /run/log/journal
+  pos_file journal.pos
+  tag journal
+  read_from_head true
+</source>
+
+# probably want to use something like fluent-plugin-rewrite-tag-filter to
+# retag entries from k8s
+<match journal>
+  @type rewrite_tag_filter
+  rewriterule1 CONTAINER_NAME ^k8s_ kubernetes.journal.container
+  ...
+</match>
+
+<filter kubernetes.**>
+  type kubernetes_metadata
+  use_journal true
+</filter>
+
+<match **>
+  type stdout
+</match>
+```
+
 ## Example input/output
 
 Kubernetes creates symlinks to Docker log files in `/var/log/containers/*.log`. Docker logs in JSON format.
 
-Assuming following inputs are coming from a log file:
+Assuming following inputs are coming from a log file named `/var/log/containers/fabric8-console-controller-98rqc_default_fabric8-console-container-df14e0d5ae4c07284fa636d739c8fc2e6b52bc344658de7d3f08c36a2e804115.log`:
 
 ```
 {
@@ -73,9 +104,10 @@ Then output becomes as belows
   "kubernetes": {
     "host": "jimmi-redhat.localnet",
     "pod_name":"fabric8-console-controller-98rqc",
+    "pod_id": "c76927af-f563-11e4-b32d-54ee7527188d",
     "container_name": "fabric8-console-container",
-    "namespace": "default",
-    "uid": "c76927af-f563-11e4-b32d-54ee7527188d",
+    "namespace_name": "default",
+    "namespace_id": "23437884-8e08-4d95-850b-e94378c9b2fd",
     "labels": {
       "component": "fabric8Console"
     }
@@ -83,6 +115,18 @@ Then output becomes as belows
 }
 ```
 
+If using journal input, from docker configured with `--log-driver=journald`, the input looks like the `journalctl -o export` format:
+```
+# The stream identification is encoded into the PRIORITY field as an
+# integer: 6, or github.com/coreos/go-systemd/journal.Info, marks stdout,
+# while 3, or github.com/coreos/go-systemd/journal.Err, marks stderr.
+PRIORITY=6
+CONTAINER_ID=b6cbb6e73c0a
+CONTAINER_ID_FULL=b6cbb6e73c0ad63ab820e4baa97cdc77cec729930e38a714826764ac0491341a
+CONTAINER_NAME=k8s_registry.a49f5318_docker-registry-1-hhoj0_default_ae3a9bdc-1f66-11e6-80a2-fa163e2fff3a_799e4035
+MESSAGE=172.17.0.1 - - [21/May/2016:16:52:05 +0000] "GET /healthz HTTP/1.1" 200 0 "" "Go-http-client/1.1"
+```
+
 ## Contributing
 
 1. Fork it
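
Note on the README changes above: as a rough illustration of the journal format, the sketch below applies the default `container_name_to_kubernetes_regexp` to the sample `CONTAINER_NAME` value and builds the `docker`/`kubernetes` skeleton that the filter starts from. The helper name `metadata_from_journal` is hypothetical and not part of the plugin, and the sketch assumes no Kubernetes API lookup takes place.

```ruby
# Default value of container_name_to_kubernetes_regexp, copied from the diff.
CONTAINER_NAME_REGEXP = Regexp.compile(
  '^k8s_(?<container_name>[^\.]+)\.(?<container_hash>[a-z0-9]{8})_(?<pod_name>[^_]+)_(?<namespace>[^_]+)_(?<pod_id>[^_]+)_(?<pod_randhex>[a-z0-9]{8})$'
)

# Hypothetical helper (an assumption for illustration, not the plugin's API):
# returns the metadata skeleton for a journal record, or nil when the record
# lacks the required fields or CONTAINER_NAME does not match.
def metadata_from_journal(record)
  return nil unless record.key?('CONTAINER_NAME') && record.key?('CONTAINER_ID_FULL')

  match = CONTAINER_NAME_REGEXP.match(record['CONTAINER_NAME'])
  return nil unless match

  {
    'docker' => { 'container_id' => record['CONTAINER_ID_FULL'] },
    'kubernetes' => {
      'namespace_name' => match['namespace'],
      'pod_name' => match['pod_name'],
      'container_name' => match['container_name']
    }
  }
end

# Sample record using the field values from the README example.
record = {
  'CONTAINER_ID_FULL' => 'b6cbb6e73c0ad63ab820e4baa97cdc77cec729930e38a714826764ac0491341a',
  'CONTAINER_NAME' => 'k8s_registry.a49f5318_docker-registry-1-hhoj0_default_ae3a9bdc-1f66-11e6-80a2-fa163e2fff3a_799e4035',
  'MESSAGE' => 'example message'
}

metadata = metadata_from_journal(record)
unmatched = metadata_from_journal('CONTAINER_NAME' => 'not-k8s', 'CONTAINER_ID_FULL' => 'abc')
puts metadata.inspect
```

Records whose `CONTAINER_NAME` does not follow the `k8s_` layout simply yield `nil` here, which mirrors the way the filter passes such records through without metadata.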
data/Rakefile CHANGED
@@ -13,8 +13,8 @@ Rake::TestTask.new(:base_test) do |t|
   # $ bundle exec rake base_test TEST=test/test_*.rb
   t.libs << 'test'
   t.test_files = Dir['test/**/test_*.rb'].sort
-  t.verbose = true
-  #t.warning = true
+  #t.verbose = true
+  t.warning = false
 end
 
 desc 'Add copyright headers'
@@ -4,7 +4,7 @@ $LOAD_PATH.unshift(lib) unless $LOAD_PATH.include?(lib)
 
 Gem::Specification.new do |gem|
   gem.name = "fluent-plugin-kubernetes_metadata_filter"
-  gem.version = "0.21.0"
+  gem.version = "0.22.0"
   gem.authors = ["Jimmi Dyson"]
   gem.email = ["jimmidyson@gmail.com"]
   gem.description = %q{Filter plugin to add Kubernetes metadata}
@@ -42,6 +42,14 @@ module Fluent
     config_param :secret_dir, :string, default: '/var/run/secrets/kubernetes.io/serviceaccount'
     config_param :de_dot, :bool, default: true
     config_param :de_dot_separator, :string, default: '_'
+    # if reading from the journal, the record will contain the following fields in the following
+    # format:
+    # CONTAINER_NAME=k8s_$containername.$containerhash_$podname_$namespacename_$poduuid_$rand32bitashex
+    # CONTAINER_FULL_ID=dockeridassha256hexvalue
+    config_param :use_journal, :bool, default: false
+    config_param :container_name_to_kubernetes_regexp,
+                 :string,
+                 :default => '^k8s_(?<container_name>[^\.]+)\.(?<container_hash>[a-z0-9]{8})_(?<pod_name>[^_]+)_(?<namespace>[^_]+)_(?<pod_id>[^_]+)_(?<pod_randhex>[a-z0-9]{8})$'
 
     def syms_to_strs(hsh)
       newhsh = {}
@@ -102,6 +110,7 @@ module Fluent
         @namespace_cache = LruRedux::TTL::ThreadSafeCache.new(@cache_size, @cache_ttl)
       end
       @tag_to_kubernetes_name_regexp_compiled = Regexp.compile(@tag_to_kubernetes_name_regexp)
+      @container_name_to_kubernetes_regexp_compiled = Regexp.compile(@container_name_to_kubernetes_regexp)
 
       # Use Kubernetes default service account if we're in a pod.
       if @kubernetes_url.nil?
@@ -161,25 +170,30 @@ module Fluent
           end
         end
       end
+      if @use_journal
+        @merge_json_log_key = 'MESSAGE'
+        self.class.class_eval { alias_method :filter_stream, :filter_stream_from_journal }
+      else
+        @merge_json_log_key = 'log'
+        self.class.class_eval { alias_method :filter_stream, :filter_stream_from_files }
+      end
     end
 
-    # NOTE: fluentd requires that records/hashes have string keys, not symbol keys
-    # http://docs.fluentd.org/articles/plugin-development#record-format
-    def filter_stream(tag, es)
+    def filter_stream_from_files(tag, es)
       new_es = MultiEventStream.new
 
       match_data = tag.match(@tag_to_kubernetes_name_regexp_compiled)
 
       if match_data
         metadata = {
-          'docker' => {
-            'container_id' => match_data['docker_id']
-          },
-          'kubernetes' => {
-            'namespace_name' => match_data['namespace'],
-            'pod_name' => match_data['pod_name'],
-            'container_name' => match_data['container_name']
-          }
+          'docker' => {
+            'container_id' => match_data['docker_id']
+          },
+          'kubernetes' => {
+            'namespace_name' => match_data['namespace'],
+            'pod_name' => match_data['pod_name'],
+            'container_name' => match_data['container_name']
+          }
         }
 
         if @kubernetes_url.present?
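
Note on the `configure` change in the hunk above: the implementation of `filter_stream` is chosen once, by aliasing it to either the journal or the file variant. The sketch below reproduces only that dispatch pattern in isolation. `DispatchSketch` and the symbol return values are hypothetical stand-ins, not the plugin's real class or API.

```ruby
# Hypothetical stand-in for the filter class; only the alias_method dispatch
# pattern is taken from the diff.
class DispatchSketch
  attr_reader :merge_json_log_key

  def initialize(use_journal)
    @use_journal = use_journal
  end

  # Picks the filter_stream implementation once, at configure time.
  def configure
    if @use_journal
      @merge_json_log_key = 'MESSAGE'
      self.class.class_eval { alias_method :filter_stream, :filter_stream_from_journal }
    else
      @merge_json_log_key = 'log'
      self.class.class_eval { alias_method :filter_stream, :filter_stream_from_files }
    end
    self
  end

  def filter_stream_from_files
    :files
  end

  def filter_stream_from_journal
    :journal
  end
end

journal_filter = DispatchSketch.new(true).configure
first_result = journal_filter.filter_stream

# Configuring a second instance re-aliases the method on the shared class.
file_filter = DispatchSketch.new(false).configure
second_result = journal_filter.filter_stream

puts first_result.inspect
puts second_result.inspect
```

One property of this pattern worth keeping in mind is that `alias_method` inside `class_eval` rebinds the method for the whole class, so in this sketch the most recently configured instance decides which variant every instance uses. The per-instance `merge_json_log_key` is unaffected.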
@@ -219,14 +233,79 @@ module Fluent
       new_es
     end
 
+    def filter_stream_from_journal(tag, es)
+      new_es = MultiEventStream.new
+
+      es.each { |time, record|
+        record = merge_json_log(record) if @merge_json_log
+
+        metadata = nil
+        if record.has_key?('CONTAINER_NAME') && record.has_key?('CONTAINER_ID_FULL')
+          metadata = record['CONTAINER_NAME'].match(@container_name_to_kubernetes_regexp_compiled) do |match_data|
+            metadata = {
+              'docker' => {
+                'container_id' => record['CONTAINER_ID_FULL']
+              },
+              'kubernetes' => {
+                'namespace_name' => match_data['namespace'],
+                'pod_name' => match_data['pod_name'],
+                'container_name' => match_data['container_name']
+              }
+            }
+            if @kubernetes_url.present?
+              cache_key = "#{metadata['kubernetes']['namespace_name']}_#{metadata['kubernetes']['pod_name']}_#{metadata['kubernetes']['container_name']}"
+
+              this = self
+              metadata = @cache.getset(cache_key) {
+                if metadata
+                  kubernetes_metadata = this.get_metadata(
+                    metadata['kubernetes']['namespace_name'],
+                    metadata['kubernetes']['pod_name'],
+                    metadata['kubernetes']['container_name']
+                  )
+                  metadata['kubernetes'] = kubernetes_metadata if kubernetes_metadata
+                  metadata
+                end
+              }
+              if match_data['pod_id'] && (match_data['pod_id'] != metadata['kubernetes']['pod_id'])
+                log.debug("pod_id #{match_data['pod_id']} from log not equal to pod_id #{metadata['kubernetes']['pod_id']} from kubernetes for #{cache_key}")
+              end
+              if @include_namespace_id
+                namespace_name = metadata['kubernetes']['namespace_name']
+                namespace_id = @namespace_cache.getset(namespace_name) {
+                  namespace = @client.get_namespace(namespace_name)
+                  namespace['metadata']['uid'] if namespace
+                }
+                metadata['kubernetes']['namespace_id'] = namespace_id if namespace_id
+              end
+            end
+            metadata
+          end
+          unless metadata
+            log.debug "Error: could not match CONTAINER_NAME from record #{record}"
+          end
+        elsif record.has_key?('CONTAINER_NAME') && record['CONTAINER_NAME'].start_with?('k8s_')
+          log.debug "Error: no container name and id in record #{record}"
+        end
+
+        if metadata
+          record = record.merge(metadata)
+        end
+
+        new_es.add(time, record)
+      }
+
+      new_es
+    end
+
     def merge_json_log(record)
-      if record.has_key?('log')
-        log = record['log'].strip
+      if record.has_key?(@merge_json_log_key)
+        log = record[@merge_json_log_key].strip
         if log[0].eql?('{') && log[-1].eql?('}')
           begin
             record = JSON.parse(log).merge(record)
             unless @preserve_json_log
-              record.delete('log')
+              record.delete(@merge_json_log_key)
             end
           rescue JSON::ParserError
           end
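
Note on the `merge_json_log` change above: the only functional difference is that the key holding the raw log line is now configurable (`log` for files, `MESSAGE` for the journal). A minimal standalone sketch of that behaviour follows. It takes the key and the preserve flag as arguments, which is an assumption made for illustration; the plugin itself reads them from instance variables.

```ruby
require 'json'

# Standalone approximation of merge_json_log. The `key` and `preserve`
# parameters are hypothetical; the plugin uses @merge_json_log_key and
# @preserve_json_log instead.
def merge_json_log(record, key, preserve: true)
  return record unless record.key?(key)

  text = record[key].strip
  # Only attempt to parse values that look like a JSON object.
  return record unless text.start_with?('{') && text.end_with?('}')

  begin
    # Fields already present in the record win over fields parsed from the log.
    merged = JSON.parse(text).merge(record)
    merged.delete(key) unless preserve
    merged
  rescue JSON::ParserError
    record
  end
end

journal_record = { 'MESSAGE' => '{"hello":"world","more":"data"}' }

preserved = merge_json_log(journal_record, 'MESSAGE')
stripped = merge_json_log(journal_record, 'MESSAGE', preserve: false)
plain = merge_json_log({ 'MESSAGE' => 'not json' }, 'MESSAGE')

puts preserved.inspect
puts stripped.inspect
```

With `preserve: false` the raw `MESSAGE` string is dropped and only the parsed fields remain, which corresponds to the `preserve_json_log false` case exercised by the new tests.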
@@ -303,6 +303,17 @@ class KubernetesMetadataFilterTest < Test::Unit::TestCase
       assert_equal(msg.merge(json_log), es.instance_variable_get(:@record_array)[0])
     end
 
+    test 'merges json log data in MESSAGE' do
+      json_log = {
+        'hello' => 'world'
+      }
+      msg = {
+        'MESSAGE' => "#{json_log.to_json}"
+      }
+      es = emit_with_tag('non-kubernetes', msg, 'use_journal true')
+      assert_equal(msg.merge(json_log), es.instance_variable_get(:@record_array)[0])
+    end
+
     test 'merges json log data with message field' do
       json_log = {
         'timeMillis' => 1459853347608,
@@ -320,6 +331,23 @@ class KubernetesMetadataFilterTest < Test::Unit::TestCase
       assert_equal(msg.merge(json_log), es.instance_variable_get(:@record_array)[0])
     end
 
+    test 'merges json log data with message field in MESSAGE' do
+      json_log = {
+        'timeMillis' => 1459853347608,
+        'thread' => 'main',
+        'level' => 'INFO',
+        'loggerName' => 'org.apache.camel.spring.SpringCamelContext',
+        'message' => 'Total 1 routes, of which 1 is started.',
+        'endOfBatch' => false,
+        'loggerFqcn' => 'org.apache.logging.slf4j.Log4jLogger'
+      }
+      msg = {
+        'MESSAGE' => "#{json_log.to_json}"
+      }
+      es = emit_with_tag('non-kubernetes', msg, 'use_journal true')
+      assert_equal(msg.merge(json_log), es.instance_variable_get(:@record_array)[0])
+    end
+
     test 'emit individual fields from json, throw out whole original string' do
       json_log = {
         'hello' => 'world',
@@ -332,6 +360,21 @@ class KubernetesMetadataFilterTest < Test::Unit::TestCase
       assert_equal(json_log, es.instance_variable_get(:@record_array)[0])
     end
 
+    test 'emit individual fields from json, throw out whole original string in MESSAGE' do
+      json_log = {
+        'hello' => 'world',
+        'more' => 'data'
+      }
+      msg = {
+        'MESSAGE' => "#{json_log.to_json}"
+      }
+      es = emit_with_tag('non-kubernetes', msg, '
+        preserve_json_log false
+        use_journal true
+      ')
+      assert_equal(json_log, es.instance_variable_get(:@record_array)[0])
+    end
+
     test 'with kubernetes dotted labels, de_dot enabled' do
       VCR.use_cassette('kubernetes_docker_metadata_dotted_labels') do
         es = emit()
@@ -388,5 +431,76 @@ class KubernetesMetadataFilterTest < Test::Unit::TestCase
         ')
       end
     end
+
+    test 'with records from journald and docker & kubernetes metadata' do
+      # with use_journal true should ignore tags and use CONTAINER_NAME and CONTAINER_ID_FULL
+      tag = 'var.log.containers.junk1_junk2_junk3-49095a2894da899d3b327c5fde1e056a81376cc9a8f8b09a195f2a92bceed450.log'
+      msg = {
+        'CONTAINER_NAME' => 'k8s_fabric8-console-container.db89db89_fabric8-console-controller-98rqc_default_c76927af-f563-11e4-b32d-54ee7527188d_89db89db',
+        'CONTAINER_ID_FULL' => '49095a2894da899d3b327c5fde1e056a81376cc9a8f8b09a195f2a92bceed459',
+        'randomfield' => 'randomvalue'
+      }
+      VCR.use_cassette('kubernetes_docker_metadata') do
+        es = emit_with_tag(tag, msg, '
+          kubernetes_url https://localhost:8443
+          watch false
+          cache_size 1
+          use_journal true
+        ')
+        expected_kube_metadata = {
+          'docker' => {
+            'container_id' => '49095a2894da899d3b327c5fde1e056a81376cc9a8f8b09a195f2a92bceed459'
+          },
+          'kubernetes' => {
+            'host' => 'jimmi-redhat.localnet',
+            'pod_name' => 'fabric8-console-controller-98rqc',
+            'container_name' => 'fabric8-console-container',
+            'namespace_name' => 'default',
+            'pod_id' => 'c76927af-f563-11e4-b32d-54ee7527188d',
+            'labels' => {
+              'component' => 'fabric8Console'
+            }
+          }
+        }.merge(msg)
+        assert_equal(expected_kube_metadata, es.instance_variable_get(:@record_array)[0])
+      end
+    end
+
+    test 'with records from journald and docker & kubernetes metadata & namespace_id enabled' do
+      # with use_journal true should ignore tags and use CONTAINER_NAME and CONTAINER_ID_FULL
+      tag = 'var.log.containers.junk1_junk2_junk3-49095a2894da899d3b327c5fde1e056a81376cc9a8f8b09a195f2a92bceed450.log'
+      msg = {
+        'CONTAINER_NAME' => 'k8s_fabric8-console-container.db89db89_fabric8-console-controller-98rqc_default_c76927af-f563-11e4-b32d-54ee7527188d_89db89db',
+        'CONTAINER_ID_FULL' => '49095a2894da899d3b327c5fde1e056a81376cc9a8f8b09a195f2a92bceed459',
+        'randomfield' => 'randomvalue'
+      }
+      VCR.use_cassette('metadata_with_namespace_id') do
+        es = emit_with_tag(tag, msg, '
+          kubernetes_url https://localhost:8443
+          watch false
+          cache_size 1
+          include_namespace_id true
+          use_journal true
+        ')
+        expected_kube_metadata = {
+          'docker' => {
+            'container_id' => '49095a2894da899d3b327c5fde1e056a81376cc9a8f8b09a195f2a92bceed459'
+          },
+          'kubernetes' => {
+            'host' => 'jimmi-redhat.localnet',
+            'pod_name' => 'fabric8-console-controller-98rqc',
+            'container_name' => 'fabric8-console-container',
+            'namespace_name' => 'default',
+            'namespace_id' => '898268c8-4a36-11e5-9d81-42010af0194c',
+            'pod_id' => 'c76927af-f563-11e4-b32d-54ee7527188d',
+            'labels' => {
+              'component' => 'fabric8Console'
+            }
+          }
+        }.merge(msg)
+        assert_equal(expected_kube_metadata, es.instance_variable_get(:@record_array)[0])
+      end
+    end
+
   end
 end
metadata CHANGED
@@ -1,14 +1,14 @@
 --- !ruby/object:Gem::Specification
 name: fluent-plugin-kubernetes_metadata_filter
 version: !ruby/object:Gem::Version
-  version: 0.21.0
+  version: 0.22.0
 platform: ruby
 authors:
 - Jimmi Dyson
 autorequire:
 bindir: bin
 cert_chain: []
-date: 2016-04-22 00:00:00.000000000 Z
+date: 2016-06-01 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: fluentd