fluent-plugin-kubernetes_metadata_filter 0.21.0 → 0.22.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA1:
3
- metadata.gz: b20a61e09187754c82416670f602ee2588399078
4
- data.tar.gz: ff91664e2431bf6d419b9a9a4ef04367ce0d8dae
3
+ metadata.gz: 300f1d004ac647a2b8ecbbd255b25ccc37865794
4
+ data.tar.gz: 871d3ccd62bc3d9129fd3b9b78858022aa576d40
5
5
  SHA512:
6
- metadata.gz: 10d00af65128a6b779fb8aab70a39e0ac20d8cf6b3030514eec564d66e3297e04ea43689cf7296e55ab2174332c0d9aac11425bcc506459798c0820cc45ec4fa
7
- data.tar.gz: 2bbd2911b66686211fefcf31f62a076b24b37b3784cf4998f1d5f7e5c9238adc2a1752dd8d9a3645762fbf912f787da3941d84b30d8056a9220ce04d24ed35b6
6
+ metadata.gz: a64f8fc1a0511df3df1afe86c33150b99f3636e95d91e2f385d49fdea82ec7e2d446cc2fe4f1e7cd2d26eb7d47466214c79d3ad6b6f874f0b5ca85d875972e1e
7
+ data.tar.gz: d6a913cd9c1a0ee7def06ef7ef67b008a3a5c3086e92a65d89b5f3fe9e36e26d4507b17484cc3cf421b7d26d56324fd34a7ee6a9d1aaef1b7ca4d0074d5d82d7
data/README.md CHANGED
@@ -27,7 +27,10 @@ This must used named capture groups for `container_name`, `pod_name` & `namespac
27
27
  * `preserve_json_log` - preserve JSON logs in raw form in the `log` key, only used if the previous option is true (default: `true`)
28
28
  * `de_dot` - replace dots in labels with configured `de_dot_separator`, required for ElasticSearch 2.x compatibility (default: `true`)
29
29
  * `de_dot_separator` - separator to use if `de_dot` is enabled (default: `_`)
30
+ * `use_journal` - If false (default), messages are expected to be formatted and tagged as if read by the fluentd in\_tail plugin with wildcard filename. If true, messages are expected to be formatted as if read from the systemd journal. The `MESSAGE` field has the full message. The `CONTAINER_NAME` field has the encoded k8s metadata (see below). The `CONTAINER_ID_FULL` field has the full container uuid. This requires docker to use the `--log-driver=journald` log driver.
31
+ * `container_name_to_kubernetes_regexp` - The regular expression used to extract the k8s metadata encoded in the journal `CONTAINER_NAME` field (default: `'^k8s_(?<container_name>[^\.]+)\.(?<container_hash>[a-z0-9]{8})_(?<pod_name>[^_]+)_(?<namespace>[^_]+)_(?<pod_id>[^_]+)_(?<pod_randhex>[a-z0-9]{8})$'`)
30
32
 
33
+ Reading from the JSON formatted log files with `in_tail` and wildcard filenames:
31
34
  ```
32
35
  <source>
33
36
  type tail
@@ -48,11 +51,39 @@ This must used named capture groups for `container_name`, `pod_name` & `namespac
48
51
  </match>
49
52
  ```
50
53
 
54
+ Reading from the systemd journal (requires the fluentd `fluent-plugin-systemd` and `systemd-journal` plugins, and requires docker to use the `--log-driver=journald` log driver):
55
+ ```
56
+ <source>
57
+ type systemd
58
+ path /run/log/journal
59
+ pos_file journal.pos
60
+ tag journal
61
+ read_from_head true
62
+ </source>
63
+
64
+ # probably want to use something like fluent-plugin-rewrite-tag-filter to
65
+ # retag entries from k8s
66
+ <match journal>
67
+ @type rewrite_tag_filter
68
+ rewriterule1 CONTAINER_NAME ^k8s_ kubernetes.journal.container
69
+ ...
70
+ </match>
71
+
72
+ <filter kubernetes.**>
73
+ type kubernetes_metadata
74
+ use_journal true
75
+ </filter>
76
+
77
+ <match **>
78
+ type stdout
79
+ </match>
80
+ ```
81
+
51
82
  ## Example input/output
52
83
 
53
84
  Kubernetes creates symlinks to Docker log files in `/var/log/containers/*.log`. Docker logs in JSON format.
54
85
 
55
- Assuming following inputs are coming from a log file:
86
+ Assuming following inputs are coming from a log file named `/var/log/containers/fabric8-console-controller-98rqc_default_fabric8-console-container-df14e0d5ae4c07284fa636d739c8fc2e6b52bc344658de7d3f08c36a2e804115.log`:
56
87
 
57
88
  ```
58
89
  {
@@ -73,9 +104,10 @@ Then output becomes as belows
73
104
  "kubernetes": {
74
105
  "host": "jimmi-redhat.localnet",
75
106
  "pod_name":"fabric8-console-controller-98rqc",
107
+ "pod_id": "c76927af-f563-11e4-b32d-54ee7527188d",
76
108
  "container_name": "fabric8-console-container",
77
- "namespace": "default",
78
- "uid": "c76927af-f563-11e4-b32d-54ee7527188d",
109
+ "namespace_name": "default",
110
+ "namespace_id": "23437884-8e08-4d95-850b-e94378c9b2fd",
79
111
  "labels": {
80
112
  "component": "fabric8Console"
81
113
  }
@@ -83,6 +115,18 @@ Then output becomes as belows
83
115
  }
84
116
  ```
85
117
 
118
+ If using journal input, from docker configured with `--log-driver=journald`, the input looks like the `journalctl -o export` format:
119
+ ```
120
+ # The stream identification is encoded into the PRIORITY field as an
121
+ # integer: 6, or github.com/coreos/go-systemd/journal.Info, marks stdout,
122
+ # while 3, or github.com/coreos/go-systemd/journal.Err, marks stderr.
123
+ PRIORITY=6
124
+ CONTAINER_ID=b6cbb6e73c0a
125
+ CONTAINER_ID_FULL=b6cbb6e73c0ad63ab820e4baa97cdc77cec729930e38a714826764ac0491341a
126
+ CONTAINER_NAME=k8s_registry.a49f5318_docker-registry-1-hhoj0_default_ae3a9bdc-1f66-11e6-80a2-fa163e2fff3a_799e4035
127
+ MESSAGE=172.17.0.1 - - [21/May/2016:16:52:05 +0000] "GET /healthz HTTP/1.1" 200 0 "" "Go-http-client/1.1"
128
+ ```
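To make the journal record above concrete, here is a minimal Ruby sketch of how the default `container_name_to_kubernetes_regexp` splits `CONTAINER_NAME` into the fields the filter uses. The regular expression and field names are taken from this diff; the helpers `journal_metadata` and `stream_for_priority` are hypothetical names introduced only for illustration, and the sketch skips the Kubernetes API lookup and caching that the plugin performs.

```ruby
# Default value of container_name_to_kubernetes_regexp, as it appears in this diff.
CONTAINER_NAME_REGEXP = Regexp.compile(
  '^k8s_(?<container_name>[^\.]+)\.(?<container_hash>[a-z0-9]{8})_(?<pod_name>[^_]+)_(?<namespace>[^_]+)_(?<pod_id>[^_]+)_(?<pod_randhex>[a-z0-9]{8})$'
)

# Hypothetical helper (not part of the plugin): builds the initial metadata
# hash from a journal record, or returns nil if the record does not qualify.
def journal_metadata(record)
  return nil unless record.key?('CONTAINER_NAME') && record.key?('CONTAINER_ID_FULL')

  match_data = CONTAINER_NAME_REGEXP.match(record['CONTAINER_NAME'])
  return nil unless match_data

  {
    'docker' => { 'container_id' => record['CONTAINER_ID_FULL'] },
    'kubernetes' => {
      'namespace_name' => match_data['namespace'],
      'pod_name'       => match_data['pod_name'],
      'container_name' => match_data['container_name']
    }
  }
end

# Hypothetical helper: maps the PRIORITY value described in the comment above
# to a stream name (6 marks stdout, 3 marks stderr, anything else is unknown).
def stream_for_priority(priority)
  case priority.to_i
  when 6 then 'stdout'
  when 3 then 'stderr'
  end
end

record = {
  'PRIORITY'          => '6',
  'CONTAINER_ID_FULL' => 'b6cbb6e73c0ad63ab820e4baa97cdc77cec729930e38a714826764ac0491341a',
  'CONTAINER_NAME'    => 'k8s_registry.a49f5318_docker-registry-1-hhoj0_default_ae3a9bdc-1f66-11e6-80a2-fa163e2fff3a_799e4035'
}

p journal_metadata(record)
p stream_for_priority(record['PRIORITY'])
```

The regular expression also captures `pod_id`; in the new `filter_stream_from_journal` method that value is only compared against the pod ID returned by Kubernetes and logged at debug level when the two differ.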
129
+
86
130
  ## Contributing
87
131
 
88
132
  1. Fork it
data/Rakefile CHANGED
@@ -13,8 +13,8 @@ Rake::TestTask.new(:base_test) do |t|
13
13
  # $ bundle exec rake base_test TEST=test/test_*.rb
14
14
  t.libs << 'test'
15
15
  t.test_files = Dir['test/**/test_*.rb'].sort
16
- t.verbose = true
17
- #t.warning = true
16
+ #t.verbose = true
17
+ t.warning = false
18
18
  end
19
19
 
20
20
  desc 'Add copyright headers'
@@ -4,7 +4,7 @@ $LOAD_PATH.unshift(lib) unless $LOAD_PATH.include?(lib)
4
4
 
5
5
  Gem::Specification.new do |gem|
6
6
  gem.name = "fluent-plugin-kubernetes_metadata_filter"
7
- gem.version = "0.21.0"
7
+ gem.version = "0.22.0"
8
8
  gem.authors = ["Jimmi Dyson"]
9
9
  gem.email = ["jimmidyson@gmail.com"]
10
10
  gem.description = %q{Filter plugin to add Kubernetes metadata}
@@ -42,6 +42,14 @@ module Fluent
42
42
  config_param :secret_dir, :string, default: '/var/run/secrets/kubernetes.io/serviceaccount'
43
43
  config_param :de_dot, :bool, default: true
44
44
  config_param :de_dot_separator, :string, default: '_'
45
+ # if reading from the journal, the record will contain the following fields in the following
46
+ # format:
47
+ # CONTAINER_NAME=k8s_$containername.$containerhash_$podname_$namespacename_$poduuid_$rand32bitashex
48
+ # CONTAINER_ID_FULL=dockeridassha256hexvalue
49
+ config_param :use_journal, :bool, default: false
50
+ config_param :container_name_to_kubernetes_regexp,
51
+ :string,
52
+ :default => '^k8s_(?<container_name>[^\.]+)\.(?<container_hash>[a-z0-9]{8})_(?<pod_name>[^_]+)_(?<namespace>[^_]+)_(?<pod_id>[^_]+)_(?<pod_randhex>[a-z0-9]{8})$'
45
53
 
46
54
  def syms_to_strs(hsh)
47
55
  newhsh = {}
@@ -102,6 +110,7 @@ module Fluent
102
110
  @namespace_cache = LruRedux::TTL::ThreadSafeCache.new(@cache_size, @cache_ttl)
103
111
  end
104
112
  @tag_to_kubernetes_name_regexp_compiled = Regexp.compile(@tag_to_kubernetes_name_regexp)
113
+ @container_name_to_kubernetes_regexp_compiled = Regexp.compile(@container_name_to_kubernetes_regexp)
105
114
 
106
115
  # Use Kubernetes default service account if we're in a pod.
107
116
  if @kubernetes_url.nil?
@@ -161,25 +170,30 @@ module Fluent
161
170
  end
162
171
  end
163
172
  end
173
+ if @use_journal
174
+ @merge_json_log_key = 'MESSAGE'
175
+ self.class.class_eval { alias_method :filter_stream, :filter_stream_from_journal }
176
+ else
177
+ @merge_json_log_key = 'log'
178
+ self.class.class_eval { alias_method :filter_stream, :filter_stream_from_files }
179
+ end
164
180
  end
165
181
 
166
- # NOTE: fluentd requires that records/hashes have string keys, not symbol keys
167
- # http://docs.fluentd.org/articles/plugin-development#record-format
168
- def filter_stream(tag, es)
182
+ def filter_stream_from_files(tag, es)
169
183
  new_es = MultiEventStream.new
170
184
 
171
185
  match_data = tag.match(@tag_to_kubernetes_name_regexp_compiled)
172
186
 
173
187
  if match_data
174
188
  metadata = {
175
- 'docker' => {
176
- 'container_id' => match_data['docker_id']
177
- },
178
- 'kubernetes' => {
179
- 'namespace_name' => match_data['namespace'],
180
- 'pod_name' => match_data['pod_name'],
181
- 'container_name' => match_data['container_name']
182
- }
189
+ 'docker' => {
190
+ 'container_id' => match_data['docker_id']
191
+ },
192
+ 'kubernetes' => {
193
+ 'namespace_name' => match_data['namespace'],
194
+ 'pod_name' => match_data['pod_name'],
195
+ 'container_name' => match_data['container_name']
196
+ }
183
197
  }
184
198
 
185
199
  if @kubernetes_url.present?
@@ -219,14 +233,79 @@ module Fluent
219
233
  new_es
220
234
  end
221
235
 
236
+ def filter_stream_from_journal(tag, es)
237
+ new_es = MultiEventStream.new
238
+
239
+ es.each { |time, record|
240
+ record = merge_json_log(record) if @merge_json_log
241
+
242
+ metadata = nil
243
+ if record.has_key?('CONTAINER_NAME') && record.has_key?('CONTAINER_ID_FULL')
244
+ metadata = record['CONTAINER_NAME'].match(@container_name_to_kubernetes_regexp_compiled) do |match_data|
245
+ metadata = {
246
+ 'docker' => {
247
+ 'container_id' => record['CONTAINER_ID_FULL']
248
+ },
249
+ 'kubernetes' => {
250
+ 'namespace_name' => match_data['namespace'],
251
+ 'pod_name' => match_data['pod_name'],
252
+ 'container_name' => match_data['container_name']
253
+ }
254
+ }
255
+ if @kubernetes_url.present?
256
+ cache_key = "#{metadata['kubernetes']['namespace_name']}_#{metadata['kubernetes']['pod_name']}_#{metadata['kubernetes']['container_name']}"
257
+
258
+ this = self
259
+ metadata = @cache.getset(cache_key) {
260
+ if metadata
261
+ kubernetes_metadata = this.get_metadata(
262
+ metadata['kubernetes']['namespace_name'],
263
+ metadata['kubernetes']['pod_name'],
264
+ metadata['kubernetes']['container_name']
265
+ )
266
+ metadata['kubernetes'] = kubernetes_metadata if kubernetes_metadata
267
+ metadata
268
+ end
269
+ }
270
+ if match_data['pod_id'] && (match_data['pod_id'] != metadata['kubernetes']['pod_id'])
271
+ log.debug("pod_id #{match_data['pod_id']} from log not equal to pod_id #{metadata['kubernetes']['pod_id']} from kubernetes for #{cache_key}")
272
+ end
273
+ if @include_namespace_id
274
+ namespace_name = metadata['kubernetes']['namespace_name']
275
+ namespace_id = @namespace_cache.getset(namespace_name) {
276
+ namespace = @client.get_namespace(namespace_name)
277
+ namespace['metadata']['uid'] if namespace
278
+ }
279
+ metadata['kubernetes']['namespace_id'] = namespace_id if namespace_id
280
+ end
281
+ end
282
+ metadata
283
+ end
284
+ unless metadata
285
+ log.debug "Error: could not match CONTAINER_NAME from record #{record}"
286
+ end
287
+ elsif record.has_key?('CONTAINER_NAME') && record['CONTAINER_NAME'].start_with?('k8s_')
288
+ log.debug "Error: no container name and id in record #{record}"
289
+ end
290
+
291
+ if metadata
292
+ record = record.merge(metadata)
293
+ end
294
+
295
+ new_es.add(time, record)
296
+ }
297
+
298
+ new_es
299
+ end
300
+
222
301
  def merge_json_log(record)
223
- if record.has_key?('log')
224
- log = record['log'].strip
302
+ if record.has_key?(@merge_json_log_key)
303
+ log = record[@merge_json_log_key].strip
225
304
  if log[0].eql?('{') && log[-1].eql?('}')
226
305
  begin
227
306
  record = JSON.parse(log).merge(record)
228
307
  unless @preserve_json_log
229
- record.delete('log')
308
+ record.delete(@merge_json_log_key)
230
309
  end
231
310
  rescue JSON::ParserError
232
311
  end
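The `merge_json_log` change above replaces the hard-coded `'log'` key with `@merge_json_log_key`, which is set to `'MESSAGE'` when `use_journal` is true. A minimal standalone sketch of that behavior follows; it assumes a hypothetical signature that takes the key and the preserve flag as keyword arguments, whereas the plugin reads both from instance variables.

```ruby
require 'json'

# Hypothetical standalone version of merge_json_log, for illustration only.
# key:      the record field holding the raw log line ('log' or 'MESSAGE')
# preserve: mirrors preserve_json_log; when false the raw field is dropped
def merge_json_log(record, key:, preserve: true)
  return record unless record.key?(key)

  text = record[key].strip
  # Only attempt to parse values that look like a JSON object.
  return record unless text.start_with?('{') && text.end_with?('}')

  begin
    # Existing record fields win over parsed JSON fields with the same name.
    merged = JSON.parse(text).merge(record)
    merged.delete(key) unless preserve
    merged
  rescue JSON::ParserError
    # Not valid JSON after all: leave the record untouched.
    record
  end
end

msg = { 'MESSAGE' => '{"hello":"world"}' }
p merge_json_log(msg, key: 'MESSAGE')
p merge_json_log(msg, key: 'MESSAGE', preserve: false)
```

With `preserve: true` the raw `MESSAGE` string stays alongside the parsed fields, which is what the new `merges json log data in MESSAGE` test asserts; with `preserve: false` only the parsed fields remain.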
@@ -303,6 +303,17 @@ class KubernetesMetadataFilterTest < Test::Unit::TestCase
303
303
  assert_equal(msg.merge(json_log), es.instance_variable_get(:@record_array)[0])
304
304
  end
305
305
 
306
+ test 'merges json log data in MESSAGE' do
307
+ json_log = {
308
+ 'hello' => 'world'
309
+ }
310
+ msg = {
311
+ 'MESSAGE' => "#{json_log.to_json}"
312
+ }
313
+ es = emit_with_tag('non-kubernetes', msg, 'use_journal true')
314
+ assert_equal(msg.merge(json_log), es.instance_variable_get(:@record_array)[0])
315
+ end
316
+
306
317
  test 'merges json log data with message field' do
307
318
  json_log = {
308
319
  'timeMillis' => 1459853347608,
@@ -320,6 +331,23 @@ class KubernetesMetadataFilterTest < Test::Unit::TestCase
320
331
  assert_equal(msg.merge(json_log), es.instance_variable_get(:@record_array)[0])
321
332
  end
322
333
 
334
+ test 'merges json log data with message field in MESSAGE' do
335
+ json_log = {
336
+ 'timeMillis' => 1459853347608,
337
+ 'thread' => 'main',
338
+ 'level' => 'INFO',
339
+ 'loggerName' => 'org.apache.camel.spring.SpringCamelContext',
340
+ 'message' => 'Total 1 routes, of which 1 is started.',
341
+ 'endOfBatch' => false,
342
+ 'loggerFqcn' => 'org.apache.logging.slf4j.Log4jLogger'
343
+ }
344
+ msg = {
345
+ 'MESSAGE' => "#{json_log.to_json}"
346
+ }
347
+ es = emit_with_tag('non-kubernetes', msg, 'use_journal true')
348
+ assert_equal(msg.merge(json_log), es.instance_variable_get(:@record_array)[0])
349
+ end
350
+
323
351
  test 'emit individual fields from json, throw out whole original string' do
324
352
  json_log = {
325
353
  'hello' => 'world',
@@ -332,6 +360,21 @@ class KubernetesMetadataFilterTest < Test::Unit::TestCase
332
360
  assert_equal(json_log, es.instance_variable_get(:@record_array)[0])
333
361
  end
334
362
 
363
+ test 'emit individual fields from json, throw out whole original string in MESSAGE' do
364
+ json_log = {
365
+ 'hello' => 'world',
366
+ 'more' => 'data'
367
+ }
368
+ msg = {
369
+ 'MESSAGE' => "#{json_log.to_json}"
370
+ }
371
+ es = emit_with_tag('non-kubernetes', msg, '
372
+ preserve_json_log false
373
+ use_journal true
374
+ ')
375
+ assert_equal(json_log, es.instance_variable_get(:@record_array)[0])
376
+ end
377
+
335
378
  test 'with kubernetes dotted labels, de_dot enabled' do
336
379
  VCR.use_cassette('kubernetes_docker_metadata_dotted_labels') do
337
380
  es = emit()
@@ -388,5 +431,76 @@ class KubernetesMetadataFilterTest < Test::Unit::TestCase
388
431
  ')
389
432
  end
390
433
  end
434
+
435
+ test 'with records from journald and docker & kubernetes metadata' do
436
+ # with use_journal true should ignore tags and use CONTAINER_NAME and CONTAINER_ID_FULL
437
+ tag = 'var.log.containers.junk1_junk2_junk3-49095a2894da899d3b327c5fde1e056a81376cc9a8f8b09a195f2a92bceed450.log'
438
+ msg = {
439
+ 'CONTAINER_NAME' => 'k8s_fabric8-console-container.db89db89_fabric8-console-controller-98rqc_default_c76927af-f563-11e4-b32d-54ee7527188d_89db89db',
440
+ 'CONTAINER_ID_FULL' => '49095a2894da899d3b327c5fde1e056a81376cc9a8f8b09a195f2a92bceed459',
441
+ 'randomfield' => 'randomvalue'
442
+ }
443
+ VCR.use_cassette('kubernetes_docker_metadata') do
444
+ es = emit_with_tag(tag, msg, '
445
+ kubernetes_url https://localhost:8443
446
+ watch false
447
+ cache_size 1
448
+ use_journal true
449
+ ')
450
+ expected_kube_metadata = {
451
+ 'docker' => {
452
+ 'container_id' => '49095a2894da899d3b327c5fde1e056a81376cc9a8f8b09a195f2a92bceed459'
453
+ },
454
+ 'kubernetes' => {
455
+ 'host' => 'jimmi-redhat.localnet',
456
+ 'pod_name' => 'fabric8-console-controller-98rqc',
457
+ 'container_name' => 'fabric8-console-container',
458
+ 'namespace_name' => 'default',
459
+ 'pod_id' => 'c76927af-f563-11e4-b32d-54ee7527188d',
460
+ 'labels' => {
461
+ 'component' => 'fabric8Console'
462
+ }
463
+ }
464
+ }.merge(msg)
465
+ assert_equal(expected_kube_metadata, es.instance_variable_get(:@record_array)[0])
466
+ end
467
+ end
468
+
469
+ test 'with records from journald and docker & kubernetes metadata & namespace_id enabled' do
470
+ # with use_journal true should ignore tags and use CONTAINER_NAME and CONTAINER_ID_FULL
471
+ tag = 'var.log.containers.junk1_junk2_junk3-49095a2894da899d3b327c5fde1e056a81376cc9a8f8b09a195f2a92bceed450.log'
472
+ msg = {
473
+ 'CONTAINER_NAME' => 'k8s_fabric8-console-container.db89db89_fabric8-console-controller-98rqc_default_c76927af-f563-11e4-b32d-54ee7527188d_89db89db',
474
+ 'CONTAINER_ID_FULL' => '49095a2894da899d3b327c5fde1e056a81376cc9a8f8b09a195f2a92bceed459',
475
+ 'randomfield' => 'randomvalue'
476
+ }
477
+ VCR.use_cassette('metadata_with_namespace_id') do
478
+ es = emit_with_tag(tag, msg, '
479
+ kubernetes_url https://localhost:8443
480
+ watch false
481
+ cache_size 1
482
+ include_namespace_id true
483
+ use_journal true
484
+ ')
485
+ expected_kube_metadata = {
486
+ 'docker' => {
487
+ 'container_id' => '49095a2894da899d3b327c5fde1e056a81376cc9a8f8b09a195f2a92bceed459'
488
+ },
489
+ 'kubernetes' => {
490
+ 'host' => 'jimmi-redhat.localnet',
491
+ 'pod_name' => 'fabric8-console-controller-98rqc',
492
+ 'container_name' => 'fabric8-console-container',
493
+ 'namespace_name' => 'default',
494
+ 'namespace_id' => '898268c8-4a36-11e5-9d81-42010af0194c',
495
+ 'pod_id' => 'c76927af-f563-11e4-b32d-54ee7527188d',
496
+ 'labels' => {
497
+ 'component' => 'fabric8Console'
498
+ }
499
+ }
500
+ }.merge(msg)
501
+ assert_equal(expected_kube_metadata, es.instance_variable_get(:@record_array)[0])
502
+ end
503
+ end
504
+
391
505
  end
392
506
  end
metadata CHANGED
@@ -1,14 +1,14 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: fluent-plugin-kubernetes_metadata_filter
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.21.0
4
+ version: 0.22.0
5
5
  platform: ruby
6
6
  authors:
7
7
  - Jimmi Dyson
8
8
  autorequire:
9
9
  bindir: bin
10
10
  cert_chain: []
11
- date: 2016-04-22 00:00:00.000000000 Z
11
+ date: 2016-06-01 00:00:00.000000000 Z
12
12
  dependencies:
13
13
  - !ruby/object:Gem::Dependency
14
14
  name: fluentd