fluent-plugin-prometheus 1.3.0 → 1.7.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: 274c9ef3f67ca084c8b46d6a24ca97ce8313d2ade4409426dc2bb664ec227f31
- data.tar.gz: 1c3cc72e17e834cf5144315086856bc8a8394a1a893771f7521e0a0fd4adf73f
+ metadata.gz: 95008f0031db355e5a5bbf4ed21f8f9f0a07839480e40bf44e974046b04f562f
+ data.tar.gz: 994a30da7b1ff0ad77f12e5b47d9218a2deb97effc31a36fa83175077cc6ed69
  SHA512:
- metadata.gz: 791e3bf0e513d98d32f0bb8a0f936e305f373ae2706fd9ad6bd3cdd12eb9adcd288d811ea3bcc9eb06c50b7eb583ec5edda3fb52751f7def52252c1174780407
- data.tar.gz: 2aa16a95bedb91f4f04bac51a8fe96ee540aefe13532429d12803f97f0537e9fd3c10bb54a37c8145f8a58fe0e06e7975d7e99e1a8addb686eb12311ab9951c5
+ metadata.gz: e0dc26e78e242e51e0fa87257b85c68af2ae32c11f7dabd1b4245e3ad7e1b1a7182b1b8432578f98af8a5a22d14c14e664285a238f74ebc40df0c0461026f323
+ data.tar.gz: fb31a79f327ce31e8203fa1fa9a5b9d9166410d71f9b62d4fd5411d246c768dfde5c9c076ff6e72f0d9f5bd4c7657c9180eb91ba4ee383729856da0704930404
@@ -1,9 +1,9 @@
  language: ruby

  rvm:
- - "2.3.6"
- - "2.4.3"
- - "2.5.0"
+ - "2.4.7"
+ - "2.5"
+ - "2.6"

  gemfile:
  - Gemfile
data/README.md CHANGED
@@ -58,18 +58,25 @@ More configuration parameters:
  - `bind`: binding interface (default: '0.0.0.0')
  - `port`: listen port (default: 24231)
  - `metrics_path`: metrics HTTP endpoint (default: /metrics)
+ - `aggregated_metrics_path`: metrics HTTP endpoint (default: /aggregated_metrics)

  When using multiple workers, each worker binds to port + `fluent_worker_id`.
+ To scrape metrics from all workers at once, you can access http://localhost:24231/aggregated_metrics.

  ### prometheus_monitor input plugin

- This plugin collects internal metrics in Fluentd. The metrics are similar to/part of [monitor_agent](http://docs.fluentd.org/articles/monitoring#monitoring-agent).
+ This plugin collects internal metrics in Fluentd. The metrics are similar to/part of [monitor_agent](https://docs.fluentd.org/input/monitor_agent).

- Current exposed metrics:

- - `buffere_queue_length` of each BufferedOutput plugins
- - `buffer_total_queued_size` of each BufferedOutput plugins
- - `retry_count` of each BufferedOutput plugins
+ #### Exposed metrics
+
+ - `fluentd_status_buffer_queue_length`
+ - `fluentd_status_buffer_total_queued_size`
+ - `fluentd_status_retry_count`
+ - `fluentd_status_buffer_newest_timekey` from fluentd v1.4.2
+ - `fluentd_status_buffer_oldest_timekey` from fluentd v1.4.2
+
+ #### Configuration

  With following configuration, those metrics are collected.
@@ -86,26 +93,35 @@ More configuration parameters:
  ### prometheus_output_monitor input plugin

- **experimental**
-
  This plugin collects internal metrics for output plugin in Fluentd. This is similar to `prometheus_monitor` plugin, but specialized for output plugin. There are Many metrics `prometheus_monitor` does not include, such as `num_errors`, `retry_wait` and so on.

- Current exposed metrics:
+ #### Exposed metrics
+
+ Metrics for output

- - `fluentd_output_status_buffer_queue_length`
- - `fluentd_output_status_buffer_total_bytes`
  - `fluentd_output_status_retry_count`
  - `fluentd_output_status_num_errors`
  - `fluentd_output_status_emit_count`
  - `fluentd_output_status_retry_wait`
  - current retry_wait computed from last retry time and next retry time
  - `fluentd_output_status_emit_records`
- - only for v0.14
  - `fluentd_output_status_write_count`
- - only for v0.14
  - `fluentd_output_status_rollback_count`
- - only for v0.14
+ - `fluentd_output_status_flush_time_count` from fluentd v1.6.0
+ - `fluentd_output_status_slow_flush_count` from fluentd v1.6.0

+ Metrics for buffer
+
+ - `fluentd_output_status_buffer_total_bytes`
+ - `fluentd_output_status_buffer_stage_length` from fluentd v1.6.0
+ - `fluentd_output_status_buffer_stage_byte_size` from fluentd v1.6.0
+ - `fluentd_output_status_buffer_queue_length`
+ - `fluentd_output_status_buffer_queue_byte_size` from fluentd v1.6.0
+ - `fluentd_output_status_buffer_newest_timekey` from fluentd v1.6.0
+ - `fluentd_output_status_buffer_oldest_timekey` from fluentd v1.6.0
+ - `fluentd_output_status_buffer_available_space_ratio` from fluentd v1.6.0
+
+ #### Configuration

  With following configuration, those metrics are collected.
@@ -122,13 +138,11 @@ More configuration parameters:
  ### prometheus_tail_monitor input plugin

- **experimental**
-
  This plugin collects internal metrics for in_tail plugin in Fluentd. in_tail plugin holds internal state for files that the plugin is watching. The state is sometimes important to monitor plugins work correctly.

  This plugin uses internal class of Fluentd, so it's easy to break.

- Current exposed metrics:
+ #### Exposed metrics

  - `fluentd_tail_file_position`
  - Current bytes which plugin reads from the file
@@ -141,6 +155,8 @@ Default labels:
  - `type`: plugin name. `in_tail` only for now.
  - `path`: file path

+ #### Configuration
+
  With following configuration, those metrics are collected.

  ```
@@ -224,7 +240,7 @@ In output plugin style:
  With above configuration, the plugin collects a metric named `message_foo_counter` from key `foo` of each records.

- You can access nested keys in records via dot or bracket notation (https://docs.fluentd.org/v1.0/articles/api-plugin-helper-record_accessor#syntax), for example: `$.kubernetes.namespace`, `$['key1'][0]['key2']`. The record accessor is enable only if the value starts with `$.` or `$[`.
+ You can access nested keys in records via dot or bracket notation (https://docs.fluentd.org/plugin-helper-overview/api-plugin-helper-record_accessor#syntax), for example: `$.kubernetes.namespace`, `$['key1'][0]['key2']`. The record accessor is enable only if the value starts with `$.` or `$[`.

  See Supported Metric Type and Labels for more configuration parameters.
@@ -341,7 +357,7 @@ You can add labels with static value or dynamic value from records. In `promethe
  All labels sections has same format. Each lines have key/value for label.

- You can access nested fields in records via dot or bracket notation (https://docs.fluentd.org/v1.0/articles/api-plugin-helper-record_accessor#syntax), for example: `$.kubernetes.namespace`, `$['key1'][0]['key2']`. The record accessor is enable only if the value starts with `$.` or `$[`. Other values are handled as raw string as is and may be expanded by placeholder described later.
+ You can access nested fields in records via dot or bracket notation (https://docs.fluentd.org/plugin-helper-overview/api-plugin-helper-record_accessor#syntax), for example: `$.kubernetes.namespace`, `$['key1'][0]['key2']`. The record accessor is enable only if the value starts with `$.` or `$[`. Other values are handled as raw string as is and may be expanded by placeholder described later.

  You can use placeholder for label values. The placeholders will be expanded from reserved values and records.
  If you specify `${hostname}`, it will be expanded by value of a hostname where fluentd runs.
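Note on the README hunks above: the two label-value rules they describe can be sketched roughly as follows. A value is a record accessor only when it starts with `$.` or `$[`, and `${hostname}`-style placeholders are expanded from reserved values. `record_accessor?` and `expand_placeholders` are hypothetical names introduced for illustration; the plugin itself relies on Fluentd's record accessor helper and a placeholder expander, which this sketch does not reproduce.

```ruby
# Hypothetical helpers illustrating the README rules; not the plugin's real API.

# A label value is treated as a record accessor only with these prefixes.
def record_accessor?(value)
  value.start_with?('$.', '$[')
end

# Replace ${name} with the matching reserved value.
# Assumption of this sketch: unknown names are left untouched.
def expand_placeholders(value, reserved)
  value.gsub(/\$\{[^}]+\}/) { |match| reserved.fetch(match[2..-2], match) }
end

puts record_accessor?('$.kubernetes.namespace')
puts expand_placeholders('host-${hostname}', { 'hostname' => 'web1' })
```

Everything that is not a record accessor stays a plain string, which is why it remains eligible for placeholder expansion.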
@@ -1,6 +1,6 @@
  Gem::Specification.new do |spec|
    spec.name = "fluent-plugin-prometheus"
-   spec.version = "1.3.0"
+   spec.version = "1.7.0"
    spec.authors = ["Masahiro Sano"]
    spec.email = ["sabottenda@gmail.com"]
    spec.summary = %q{A fluent plugin that collects metrics and exposes for Prometheus.}
@@ -14,7 +14,7 @@ Gem::Specification.new do |spec|
    spec.require_paths = ["lib"]

    spec.add_dependency "fluentd", ">= 0.14.20", "< 2"
-   spec.add_dependency "prometheus-client"
+   spec.add_dependency "prometheus-client", "< 0.10"
    spec.add_development_dependency "bundler"
    spec.add_development_dependency "rake"
    spec.add_development_dependency "rspec"
@@ -4,6 +4,7 @@ require 'fluent/plugin/filter'
  module Fluent::Plugin
    class PrometheusFilter < Fluent::Plugin::Filter
      Fluent::Plugin.register_filter('prometheus', self)
+     include Fluent::Plugin::PrometheusLabelParser
      include Fluent::Plugin::Prometheus

      def initialize
@@ -11,15 +12,19 @@ module Fluent::Plugin
        @registry = ::Prometheus::Client.registry
      end

+     def multi_workers_ready?
+       true
+     end
+
      def configure(conf)
        super
-       labels = Fluent::Plugin::Prometheus.parse_labels_elements(conf)
+       labels = parse_labels_elements(conf)
        @metrics = Fluent::Plugin::Prometheus.parse_metrics_elements(conf, @registry, labels)
      end

-     def filter_stream(tag, es)
-       instrument(tag, es, @metrics)
-       es
+     def filter(tag, time, record)
+       instrument_single(tag, time, record, @metrics)
+       record
      end
    end
  end
@@ -1,5 +1,7 @@
  require 'fluent/plugin/input'
  require 'fluent/plugin/prometheus'
+ require 'fluent/plugin/prometheus_metrics'
+ require 'net/http'
  require 'webrick'

  module Fluent::Plugin
@@ -11,6 +13,7 @@ module Fluent::Plugin
      config_param :bind, :string, default: '0.0.0.0'
      config_param :port, :integer, default: 24231
      config_param :metrics_path, :string, default: '/metrics'
+     config_param :aggregated_metrics_path, :string, default: '/aggregated_metrics'

      desc 'Enable ssl configuration for the server'
      config_section :ssl, required: false, multi: false do
@@ -31,6 +34,10 @@ module Fluent::Plugin
      attr_reader :registry

+     attr_reader :num_workers
+     attr_reader :base_port
+     attr_reader :metrics_path
+
      def initialize
        super
        @registry = ::Prometheus::Client.registry
@@ -38,6 +45,18 @@ module Fluent::Plugin
      def configure(conf)
        super
+
+       # Get how many workers we have
+       sysconf = if self.respond_to?(:owner) && owner.respond_to?(:system_config)
+                   owner.system_config
+                 elsif self.respond_to?(:system_config)
+                   self.system_config
+                 else
+                   nil
+                 end
+       @num_workers = sysconf && sysconf.workers ? sysconf.workers : 1
+
+       @base_port = @port
        @port += fluentd_worker_id
      end
@@ -82,6 +101,7 @@ module Fluent::Plugin
        @server = WEBrick::HTTPServer.new(config)
        @server.mount(@metrics_path, MonitorServlet, self)
+       @server.mount(@aggregated_metrics_path, MonitorServletAll, self)
        thread_create(:in_prometheus) do
          @server.start
        end
@@ -110,5 +130,35 @@ module Fluent::Plugin
          res.body = $!.to_s
        end
      end
+
+     class MonitorServletAll < WEBrick::HTTPServlet::AbstractServlet
+       def initialize(server, prometheus)
+         @prometheus = prometheus
+       end
+
+       def do_GET(req, res)
+         res.status = 200
+         res['Content-Type'] = ::Prometheus::Client::Formats::Text::CONTENT_TYPE
+
+         full_result = PromMetricsAggregator.new
+         fluent_server_ip = @prometheus.bind == '0.0.0.0' ? '127.0.0.1' : @prometheus.bind
+         current_worker = 0
+         while current_worker < @prometheus.num_workers
+           Net::HTTP.start(fluent_server_ip, @prometheus.base_port + current_worker) do |http|
+             req = Net::HTTP::Get.new(@prometheus.metrics_path)
+             result = http.request(req)
+             if result.is_a?(Net::HTTPSuccess)
+               full_result.add_metrics(result.body)
+             end
+           end
+           current_worker += 1
+         end
+         res.body = full_result.get_metrics
+       rescue
+         res.status = 500
+         res['Content-Type'] = 'text/plain'
+         res.body = $!.to_s
+       end
+     end
    end
  end
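Note on the hunk above: `MonitorServletAll` walks the workers by adding each worker index to the base port and substituting loopback for a wildcard bind address. A minimal sketch of that addressing rule, assuming plain HTTP and using a hypothetical helper name (`worker_metrics_urls` is not part of the plugin):

```ruby
# Sketch of how the aggregated endpoint locates each worker's metrics endpoint.
# worker_metrics_urls is a hypothetical helper, not part of the plugin.
def worker_metrics_urls(bind:, base_port:, num_workers:, metrics_path: '/metrics')
  # A wildcard bind address is not a connectable host, so loopback is used instead.
  host = bind == '0.0.0.0' ? '127.0.0.1' : bind
  (0...num_workers).map do |worker_id|
    "http://#{host}:#{base_port + worker_id}#{metrics_path}"
  end
end

puts worker_metrics_urls(bind: '0.0.0.0', base_port: 24231, num_workers: 3)
```

With the default port and three workers this lists ports 24231 through 24233, one per worker.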
@@ -5,6 +5,7 @@ require 'fluent/plugin/prometheus'
  module Fluent::Plugin
    class PrometheusMonitorInput < Fluent::Plugin::Input
      Fluent::Plugin.register_input('prometheus_monitor', self)
+     include Fluent::Plugin::PrometheusLabelParser

      helpers :timer
@@ -25,11 +26,12 @@ module Fluent::Plugin
        hostname = Socket.gethostname
        expander = Fluent::Plugin::Prometheus.placeholder_expander(log)
        placeholders = expander.prepare_placeholders({'hostname' => hostname, 'worker_id' => fluentd_worker_id})
-       @base_labels = Fluent::Plugin::Prometheus.parse_labels_elements(conf)
+       @base_labels = parse_labels_elements(conf)
        @base_labels.each do |key, value|
-         if value.is_a?(String)
-           @base_labels[key] = expander.expand(value, placeholders)
+         unless value.is_a?(String)
+           raise Fluent::ConfigError, "record accessor syntax is not available in prometheus_monitor"
          end
+         @base_labels[key] = expander.expand(value, placeholders)
        end

        if defined?(Fluent::Plugin) && defined?(Fluent::Plugin::MonitorAgentInput)
@@ -39,6 +41,17 @@ module Fluent::Plugin
          @monitor_agent = Fluent::MonitorAgentInput.new
        end

+     end
+
+     def start
+       super
+
+       @buffer_newest_timekey = @registry.gauge(
+         :fluentd_status_buffer_newest_timekey,
+         'Newest timekey in buffer.')
+       @buffer_oldest_timekey = @registry.gauge(
+         :fluentd_status_buffer_oldest_timekey,
+         'Oldest timekey in buffer.')
        buffer_queue_length = @registry.gauge(
          :fluentd_status_buffer_queue_length,
          'Current buffer queue length.')
@@ -54,20 +67,24 @@ module Fluent::Plugin
          'buffer_total_queued_size' => buffer_total_queued_size,
          'retry_count' => retry_counts,
        }
-     end
-
-     def start
-       super
        timer_execute(:in_prometheus_monitor, @interval, &method(:update_monitor_info))
      end

      def update_monitor_info
        @monitor_agent.plugins_info_all.each do |info|
+         label = labels(info)
+
          @monitor_info.each do |name, metric|
            if info[name]
-             metric.set(labels(info), info[name])
+             metric.set(label, info[name])
            end
          end
+
+         timekeys = info["buffer_timekeys"]
+         if timekeys && !timekeys.empty?
+           @buffer_newest_timekey.set(label, timekeys.max)
+           @buffer_oldest_timekey.set(label, timekeys.min)
+         end
        end
      end
@@ -5,6 +5,7 @@ require 'fluent/plugin/prometheus'
  module Fluent::Plugin
    class PrometheusOutputMonitorInput < Fluent::Input
      Fluent::Plugin.register_input('prometheus_output_monitor', self)
+     include Fluent::Plugin::PrometheusLabelParser

      helpers :timer
@@ -24,6 +25,10 @@ module Fluent::Plugin
        :emit_records,
        :write_count,
        :rollback_count,
+
+       # from v1.6.0
+       :flush_time_count,
+       :slow_flush_count,
      ]

      def initialize
@@ -40,8 +45,11 @@ module Fluent::Plugin
        hostname = Socket.gethostname
        expander = Fluent::Plugin::Prometheus.placeholder_expander(log)
        placeholders = expander.prepare_placeholders({'hostname' => hostname, 'worker_id' => fluentd_worker_id})
-       @base_labels = Fluent::Plugin::Prometheus.parse_labels_elements(conf)
+       @base_labels = parse_labels_elements(conf)
        @base_labels.each do |key, value|
+         unless value.is_a?(String)
+           raise Fluent::ConfigError, "record accessor syntax is not available in prometheus_output_monitor"
+         end
          @base_labels[key] = expander.expand(value, placeholders)
        end
@@ -51,14 +59,39 @@ module Fluent::Plugin
        else
          @monitor_agent = Fluent::MonitorAgentInput.new
        end
+     end
+
+     def start
+       super

        @metrics = {
-         buffer_queue_length: @registry.gauge(
-           :fluentd_output_status_buffer_queue_length,
-           'Current buffer queue length.'),
+         # Buffer metrics
          buffer_total_queued_size: @registry.gauge(
            :fluentd_output_status_buffer_total_bytes,
-           'Current total size of queued buffers.'),
+           'Current total size of stage and queue buffers.'),
+         buffer_stage_length: @registry.gauge(
+           :fluentd_output_status_buffer_stage_length,
+           'Current length of stage buffers.'),
+         buffer_stage_byte_size: @registry.gauge(
+           :fluentd_output_status_buffer_stage_byte_size,
+           'Current total size of stage buffers.'),
+         buffer_queue_length: @registry.gauge(
+           :fluentd_output_status_buffer_queue_length,
+           'Current length of queue buffers.'),
+         buffer_queue_byte_size: @registry.gauge(
+           :fluentd_output_status_queue_byte_size,
+           'Current total size of queue buffers.'),
+         buffer_available_buffer_space_ratios: @registry.gauge(
+           :fluentd_output_status_buffer_available_space_ratio,
+           'Ratio of available space in buffer.'),
+         buffer_newest_timekey: @registry.gauge(
+           :fluentd_output_status_buffer_newest_timekey,
+           'Newest timekey in buffer.'),
+         buffer_oldest_timekey: @registry.gauge(
+           :fluentd_output_status_buffer_oldest_timekey,
+           'Oldest timekey in buffer.'),
+
+         # Output metrics
          retry_counts: @registry.gauge(
            :fluentd_output_status_retry_count,
            'Current retry counts.'),
@@ -77,14 +110,16 @@ module Fluent::Plugin
          rollback_count: @registry.gauge(
            :fluentd_output_status_rollback_count,
            'Current rollback counts.'),
+         flush_time_count: @registry.gauge(
+           :fluentd_output_status_flush_time_count,
+           'Total flush time.'),
+         slow_flush_count: @registry.gauge(
+           :fluentd_output_status_slow_flush_count,
+           'Current slow flush counts.'),
          retry_wait: @registry.gauge(
            :fluentd_output_status_retry_wait,
            'Current retry wait'),
        }
-     end
-
-     def start
-       super
        timer_execute(:in_prometheus_output_monitor, @interval, &method(:update_monitor_info))
      end
@@ -99,8 +134,17 @@ module Fluent::Plugin
        }

        monitor_info = {
-         'buffer_queue_length' => @metrics[:buffer_queue_length],
+         # buffer metrics
          'buffer_total_queued_size' => @metrics[:buffer_total_queued_size],
+         'buffer_stage_length' => @metrics[:buffer_stage_length],
+         'buffer_stage_byte_size' => @metrics[:buffer_stage_byte_size],
+         'buffer_queue_length' => @metrics[:buffer_queue_length],
+         'buffer_queue_byte_size' => @metrics[:buffer_queue_byte_size],
+         'buffer_available_buffer_space_ratios' => @metrics[:buffer_available_buffer_space_ratios],
+         'buffer_newest_timekey' => @metrics[:buffer_newest_timekey],
+         'buffer_oldest_timekey' => @metrics[:buffer_oldest_timekey],
+
+         # output metrics
          'retry_count' => @metrics[:retry_counts],
        }
        instance_vars_info = {
@@ -109,6 +153,8 @@ module Fluent::Plugin
          emit_count: @metrics[:emit_count],
          emit_records: @metrics[:emit_records],
          rollback_count: @metrics[:rollback_count],
+         flush_time_count: @metrics[:flush_time_count],
+         slow_flush_count: @metrics[:slow_flush_count],
        }

        agent_info.each do |info|
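Note on the monitor plugin hunks above: the monitor inputs now reject any label whose parsed value is not a plain `String`, since a record accessor has no record to read from in these plugins. A minimal sketch of that check follows. `ensure_static_labels!` and `AccessorStub` are hypothetical names, and `ArgumentError` stands in for `Fluent::ConfigError` so that the sketch runs without Fluentd.

```ruby
# Hypothetical stand-in for a parsed record accessor object.
AccessorStub = Struct.new(:path)

# Raise unless every label value is a plain String.
# ArgumentError replaces Fluent::ConfigError to keep the sketch self-contained.
def ensure_static_labels!(labels, plugin_name)
  labels.each do |key, value|
    next if value.is_a?(String)

    raise ArgumentError,
          "record accessor syntax is not available in #{plugin_name} (label: #{key})"
  end
  labels
end

ensure_static_labels!({ host: '${hostname}', foo: 'bar' }, 'prometheus_output_monitor')
```

Placeholders such as `${hostname}` pass the check because they are still plain strings at this point; only accessor objects are refused.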
@@ -5,6 +5,7 @@ require 'fluent/plugin/prometheus'
  module Fluent::Plugin
    class PrometheusTailMonitorInput < Fluent::Plugin::Input
      Fluent::Plugin.register_input('prometheus_tail_monitor', self)
+     include Fluent::Plugin::PrometheusLabelParser

      helpers :timer
@@ -29,8 +30,11 @@ module Fluent::Plugin
        hostname = Socket.gethostname
        expander = Fluent::Plugin::Prometheus.placeholder_expander(log)
        placeholders = expander.prepare_placeholders({'hostname' => hostname, 'worker_id' => fluentd_worker_id})
-       @base_labels = Fluent::Plugin::Prometheus.parse_labels_elements(conf)
+       @base_labels = parse_labels_elements(conf)
        @base_labels.each do |key, value|
+         unless value.is_a?(String)
+           raise Fluent::ConfigError, "record accessor syntax is not available in prometheus_tail_monitor"
+         end
          @base_labels[key] = expander.expand(value, placeholders)
        end
@@ -40,6 +44,10 @@ module Fluent::Plugin
        else
          @monitor_agent = Fluent::MonitorAgentInput.new
        end
+     end
+
+     def start
+       super

        @metrics = {
          position: @registry.gauge(
@@ -49,10 +57,6 @@ module Fluent::Plugin
            :fluentd_tail_file_inode,
            'Current inode of file.'),
        }
-     end
-
-     def start
-       super
        timer_execute(:in_prometheus_tail_monitor, @interval, &method(:update_monitor_info))
      end
@@ -4,6 +4,7 @@ require 'fluent/plugin/prometheus'
  module Fluent::Plugin
    class PrometheusOutput < Fluent::Plugin::Output
      Fluent::Plugin.register_output('prometheus', self)
+     include Fluent::Plugin::PrometheusLabelParser
      include Fluent::Plugin::Prometheus

      def initialize
@@ -11,9 +12,13 @@ module Fluent::Plugin
        @registry = ::Prometheus::Client.registry
      end

+     def multi_workers_ready?
+       true
+     end
+
      def configure(conf)
        super
-       labels = Fluent::Plugin::Prometheus.parse_labels_elements(conf)
+       labels = parse_labels_elements(conf)
        @metrics = Fluent::Plugin::Prometheus.parse_metrics_elements(conf, @registry, labels)
      end
@@ -4,6 +4,31 @@ require 'fluent/plugin/filter_record_transformer'
  module Fluent
    module Plugin
+     module PrometheusLabelParser
+       def configure(conf)
+         super
+         # Check if running with multiple workers
+         sysconf = if self.respond_to?(:owner) && owner.respond_to?(:system_config)
+                     owner.system_config
+                   elsif self.respond_to?(:system_config)
+                     self.system_config
+                   else
+                     nil
+                   end
+         @multi_worker = sysconf && sysconf.workers ? (sysconf.workers > 1) : false
+       end
+
+       def parse_labels_elements(conf)
+         base_labels = Fluent::Plugin::Prometheus.parse_labels_elements(conf)
+
+         if @multi_worker
+           base_labels[:worker_id] = fluentd_worker_id.to_s
+         end
+
+         base_labels
+       end
+     end
+
      module Prometheus
        class AlreadyRegisteredError < StandardError; end
@@ -63,10 +88,30 @@ module Fluent
        def configure(conf)
          super
+         @placeholder_values = {}
          @placeholder_expander = Fluent::Plugin::Prometheus.placeholder_expander(log)
          @hostname = Socket.gethostname
        end

+       def instrument_single(tag, time, record, metrics)
+         @placeholder_values[tag] ||= {
+           'tag' => tag,
+           'hostname' => @hostname,
+           'worker_id' => fluentd_worker_id,
+         }
+
+         placeholders = record.merge(@placeholder_values[tag])
+         placeholders = @placeholder_expander.prepare_placeholders(placeholders)
+         metrics.each do |metric|
+           begin
+             metric.instrument(record, @placeholder_expander, placeholders)
+           rescue => e
+             log.warn "prometheus: failed to instrument a metric.", error_class: e.class, error: e, tag: tag, name: metric.name
+             router.emit_error_event(tag, time, record, e)
+           end
+         end
+       end
+
        def instrument(tag, es, metrics)
          placeholder_values = {
            'tag' => tag,
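Note on the hunk above: `instrument_single` builds the reserved placeholder values once per tag and merges them over each record. A minimal sketch of that memoization, using a hypothetical `PlaceholderCache` class that is not part of the plugin:

```ruby
# Sketch of the per-tag memoization used by instrument_single.
# PlaceholderCache is a hypothetical name introduced for illustration.
class PlaceholderCache
  def initialize(hostname:, worker_id:)
    @hostname = hostname
    @worker_id = worker_id
    @by_tag = {}
  end

  # Reserved values are built once per tag and reused for later records.
  def reserved_for(tag)
    @by_tag[tag] ||= { 'tag' => tag, 'hostname' => @hostname, 'worker_id' => @worker_id }
  end

  # Hash#merge lets the reserved keys win over record keys of the same name.
  def placeholders_for(tag, record)
    record.merge(reserved_for(tag))
  end
end

cache = PlaceholderCache.new(hostname: 'web1', worker_id: 0)
p cache.placeholders_for('app.log', { 'tag' => 'from-record', 'foo' => 1 })
```

Because the merge argument takes precedence, a record field named `tag`, `hostname` or `worker_id` is shadowed by the reserved value in the placeholder set.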
@@ -0,0 +1,77 @@
+ module Fluent::Plugin
+
+   ##
+   # PromMetricsAggregator aggregates multiples metrics exposed using Prometheus text-based format
+   # see https://github.com/prometheus/docs/blob/master/content/docs/instrumenting/exposition_formats.md
+
+
+   class PrometheusMetrics
+     def initialize
+       @comments = []
+       @metrics = []
+     end
+
+     def to_string
+       (@comments + @metrics).join("\n")
+     end
+
+     def add_comment(comment)
+       @comments << comment
+     end
+
+     def add_metric_value(value)
+       @metrics << value
+     end
+
+     attr_writer :comments, :metrics
+   end
+
+   class PromMetricsAggregator
+     def initialize
+       @metrics = {}
+     end
+
+     def get_metric_name_from_comment(line)
+       tokens = line.split(' ')
+       if ['HELP', 'TYPE'].include?(tokens[1])
+         tokens[2]
+       else
+         ''
+       end
+     end
+
+     def add_metrics(metrics)
+       current_metric = ''
+       new_metric = false
+       lines = metrics.split("\n")
+       for line in lines
+         if line[0] == '#'
+           # Metric comment (# TYPE, # HELP)
+           parsed_metric = get_metric_name_from_comment(line)
+           if parsed_metric != ''
+             if parsed_metric != current_metric
+               # Starting a new metric comment block
+               new_metric = !@metrics.key?(parsed_metric)
+               if new_metric
+                 @metrics[parsed_metric] = PrometheusMetrics.new()
+               end
+               current_metric = parsed_metric
+             end
+
+             if new_metric && parsed_metric == current_metric
+               # New metric, inject comments (# TYPE, # HELP)
+               @metrics[parsed_metric].add_comment(line)
+             end
+           end
+         else
+           # Metric value, simply append line
+           @metrics[current_metric].add_metric_value(line)
+         end
+       end
+     end
+
+     def get_metrics
+       @metrics.map{|k,v| v.to_string()}.join("\n") + (@metrics.length ? "\n" : "")
+     end
+   end
+ end
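Note on the new file above: the aggregation idea is to keep one HELP/TYPE block per metric name and append every worker's sample lines under it. It can be sketched independently of the plugin as follows. `merge_metrics` is a hypothetical helper written for illustration, not the plugin's `PromMetricsAggregator`, and it assumes each sample line follows the comment block of its metric.

```ruby
# Minimal sketch: merge Prometheus text-format payloads from several workers.
# merge_metrics is a hypothetical helper, not the plugin's PromMetricsAggregator.
def merge_metrics(payloads)
  comments = Hash.new { |hash, name| hash[name] = [] } # metric name => HELP/TYPE lines
  samples  = Hash.new { |hash, name| hash[name] = [] } # metric name => sample lines
  order = []                                           # metric names in first-seen order

  payloads.each do |payload|
    current = nil
    known = false
    payload.each_line(chomp: true) do |line|
      next if line.empty?

      if line.start_with?('#')
        _marker, kind, name = line.split(' ', 4)
        next unless %w[HELP TYPE].include?(kind)

        if name != current
          known = order.include?(name)
          order << name unless known
          current = name
        end
        # Keep the comment block only from the first payload that declares the metric.
        comments[name] << line unless known
      elsif current
        # Sample lines belong to the most recently declared metric.
        samples[current] << line
      end
    end
  end

  return '' if order.empty?

  order.flat_map { |name| comments[name] + samples[name] }.join("\n") + "\n"
end

worker0 = "# TYPE q gauge\n# HELP q Queue length.\nq{worker_id=\"0\"} 1.0\n"
worker1 = "# TYPE q gauge\n# HELP q Queue length.\nq{worker_id=\"1\"} 2.0\n"
puts merge_metrics([worker0, worker1])
```

The sketch tests `order.empty?` explicitly before appending the trailing newline. In Ruby `0` is truthy, so a ternary on a bare length such as `@metrics.length ? "\n" : ""` always takes its first branch.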
@@ -47,7 +47,7 @@ ALERT FluentdQueueLength
  }

  ALERT FluentdRecordsCountsHigh
- IF sum(rate(fluentd_record_counts{job="fluentd"}[5m])) BY (instance) > (3 * sum(rate(fluentd_record_counts{job="fluentd"}[15m])) BY (instance))
+ IF sum(rate(fluentd_output_status_emit_records{job="fluentd"}[5m])) BY (instance) > (3 * sum(rate(fluentd_output_status_emit_records{job="fluentd"}[15m])) BY (instance))
  FOR 1m
  LABELS {
  service = "fluentd",
@@ -8,8 +8,17 @@ describe Fluent::Plugin::PrometheusMonitorInput do
  <labels>
  host ${hostname}
  foo bar
- no_effect1 $.foo.bar
- no_effect2 $[0][1]
+ </labels>
+ ]
+
+ INVALID_MONITOR_CONFIG = %[
+ @type prometheus_monitor
+
+ <labels>
+ host ${hostname}
+ foo bar
+ invalid_use1 $.foo.bar
+ invalid_use2 $[0][1]
  </labels>
  ]
@@ -18,8 +27,17 @@ describe Fluent::Plugin::PrometheusMonitorInput do
    let(:driver) { Fluent::Test::Driver::Input.new(Fluent::Plugin::PrometheusMonitorInput).configure(config) }

    describe '#configure' do
-     it 'does not raise error' do
-       expect{driver}.not_to raise_error
+     describe 'valid' do
+       it 'does not raise error' do
+         expect{driver}.not_to raise_error
+       end
+     end
+
+     describe 'invalid' do
+       let(:config) { INVALID_MONITOR_CONFIG }
+       it 'expect raise error' do
+         expect{driver}.to raise_error
+       end
      end
    end
  end
@@ -0,0 +1,138 @@
1
+ require 'spec_helper'
2
+ require 'fluent/plugin/in_prometheus'
3
+ require 'fluent/test/driver/input'
4
+
5
+ require 'net/http'
6
+
7
+ describe Fluent::Plugin::PromMetricsAggregator do
8
+
9
+ metrics_worker_1 = %[# TYPE fluentd_status_buffer_queue_length gauge
10
+ # HELP fluentd_status_buffer_queue_length Current buffer queue length.
11
+ fluentd_status_buffer_queue_length{host="0123456789ab",worker_id="0",plugin_id="plugin-1",plugin_category="output",type="s3"} 0.0
12
+ fluentd_status_buffer_queue_length{host="0123456789ab",worker_id="0",plugin_id="plugin-2",plugin_category="output",type="s3"} 0.0
13
+ # TYPE fluentd_status_buffer_total_bytes gauge
14
+ # HELP fluentd_status_buffer_total_bytes Current total size of queued buffers.
15
+ fluentd_status_buffer_total_bytes{host="0123456789ab",worker_id="0",plugin_id="plugin-1",plugin_category="output",type="s3"} 0.0
16
+ fluentd_status_buffer_total_bytes{host="0123456789ab",worker_id="0",plugin_id="plugin-2",plugin_category="output",type="s3"} 0.0
17
+ # TYPE log_counter counter
18
+ # HELP log_counter the number of received logs
19
+ log_counter{worker_id="0",host="0123456789ab",tag="fluent.info"} 1.0
20
+ # HELP empty_metric A metric with no data
21
+ # TYPE empty_metric gauge
22
+ # HELP http_request_duration_seconds The HTTP request latencies in seconds.
23
+ # TYPE http_request_duration_seconds histogram
24
+ http_request_duration_seconds_bucket{code="200",worker_id="0",method="GET",le="0.005"} 58
25
+ http_request_duration_seconds_bucket{code="200",worker_id="0",method="GET",le="0.01"} 58
26
+ http_request_duration_seconds_bucket{code="200",worker_id="0",method="GET",le="0.05"} 59
27
+ http_request_duration_seconds_bucket{code="200",worker_id="0",method="GET",le="0.1"} 59
28
+ http_request_duration_seconds_bucket{code="200",worker_id="0",method="GET",le="1"} 59
29
+ http_request_duration_seconds_bucket{code="200",worker_id="0",method="GET",le="10"} 59
30
+ http_request_duration_seconds_bucket{code="200",worker_id="0",method="GET",le="+Inf"} 59
31
+ http_request_duration_seconds_sum{code="200",worker_id="0",method="GET"} 0.05046115500000003
32
+ http_request_duration_seconds_count{code="200",worker_id="0",method="GET"} 59
33
+ ]
34
+
35
+ metrics_worker_2 = %[# TYPE fluentd_output_status_buffer_queue_length gauge
36
+ # HELP fluentd_output_status_buffer_queue_length Current buffer queue length.
37
+ fluentd_output_status_buffer_queue_length{host="0123456789ab",worker_id="0",plugin_id="plugin-1",type="s3"} 0.0
38
+ fluentd_output_status_buffer_queue_length{host="0123456789ab",worker_id="0",plugin_id="plugin-2",type="s3"} 0.0
39
+ # TYPE fluentd_output_status_buffer_total_bytes gauge
40
+ # HELP fluentd_output_status_buffer_total_bytes Current total size of queued buffers.
41
+ fluentd_output_status_buffer_total_bytes{host="0123456789ab",worker_id="0",plugin_id="plugin-1",type="s3"} 0.0
42
+ fluentd_output_status_buffer_total_bytes{host="0123456789ab",worker_id="0",plugin_id="plugin-2",type="s3"} 0.0
43
+ ]
44
+
45
+ metrics_worker_3 = %[# TYPE fluentd_status_buffer_queue_length gauge
46
+ # HELP fluentd_status_buffer_queue_length Current buffer queue length.
47
+ fluentd_status_buffer_queue_length{host="0123456789ab",worker_id="1",plugin_id="plugin-1",plugin_category="output",type="s3"} 0.0
48
+ fluentd_status_buffer_queue_length{host="0123456789ab",worker_id="1",plugin_id="plugin-2",plugin_category="output",type="s3"} 0.0
49
+ # TYPE fluentd_status_buffer_total_bytes gauge
50
+ # HELP fluentd_status_buffer_total_bytes Current total size of queued buffers.
51
+ fluentd_status_buffer_total_bytes{host="0123456789ab",worker_id="1",plugin_id="plugin-1",plugin_category="output",type="s3"} 0.0
52
+ fluentd_status_buffer_total_bytes{host="0123456789ab",worker_id="1",plugin_id="plugin-2",plugin_category="output",type="s3"} 0.0
53
+ # HELP http_request_duration_seconds The HTTP request latencies in seconds.
54
+ # TYPE http_request_duration_seconds histogram
55
+ http_request_duration_seconds_bucket{code="200",worker_id="1",method="GET",le="0.005"} 70
56
+ http_request_duration_seconds_bucket{code="200",worker_id="1",method="GET",le="0.01"} 70
57
+ http_request_duration_seconds_bucket{code="200",worker_id="1",method="GET",le="0.05"} 71
58
+ http_request_duration_seconds_bucket{code="200",worker_id="1",method="GET",le="0.1"} 71
59
+ http_request_duration_seconds_bucket{code="200",worker_id="1",method="GET",le="1"} 71
60
+ http_request_duration_seconds_bucket{code="200",worker_id="1",method="GET",le="10"} 71
61
+ http_request_duration_seconds_bucket{code="200",worker_id="1",method="GET",le="+Inf"} 71
62
+ http_request_duration_seconds_sum{code="200",worker_id="1",method="GET"} 0.05646315600000003
63
+ http_request_duration_seconds_count{code="200",worker_id="1",method="GET"} 71
64
+ ]
+
+ metrics_merged_1_and_3 = %[# TYPE fluentd_status_buffer_queue_length gauge
+ # HELP fluentd_status_buffer_queue_length Current buffer queue length.
+ fluentd_status_buffer_queue_length{host="0123456789ab",worker_id="0",plugin_id="plugin-1",plugin_category="output",type="s3"} 0.0
+ fluentd_status_buffer_queue_length{host="0123456789ab",worker_id="0",plugin_id="plugin-2",plugin_category="output",type="s3"} 0.0
+ fluentd_status_buffer_queue_length{host="0123456789ab",worker_id="1",plugin_id="plugin-1",plugin_category="output",type="s3"} 0.0
+ fluentd_status_buffer_queue_length{host="0123456789ab",worker_id="1",plugin_id="plugin-2",plugin_category="output",type="s3"} 0.0
+ # TYPE fluentd_status_buffer_total_bytes gauge
+ # HELP fluentd_status_buffer_total_bytes Current total size of queued buffers.
+ fluentd_status_buffer_total_bytes{host="0123456789ab",worker_id="0",plugin_id="plugin-1",plugin_category="output",type="s3"} 0.0
+ fluentd_status_buffer_total_bytes{host="0123456789ab",worker_id="0",plugin_id="plugin-2",plugin_category="output",type="s3"} 0.0
+ fluentd_status_buffer_total_bytes{host="0123456789ab",worker_id="1",plugin_id="plugin-1",plugin_category="output",type="s3"} 0.0
+ fluentd_status_buffer_total_bytes{host="0123456789ab",worker_id="1",plugin_id="plugin-2",plugin_category="output",type="s3"} 0.0
+ # TYPE log_counter counter
+ # HELP log_counter the number of received logs
+ log_counter{worker_id="0",host="0123456789ab",tag="fluent.info"} 1.0
+ # HELP empty_metric A metric with no data
+ # TYPE empty_metric gauge
+ # HELP http_request_duration_seconds The HTTP request latencies in seconds.
+ # TYPE http_request_duration_seconds histogram
+ http_request_duration_seconds_bucket{code="200",worker_id="0",method="GET",le="0.005"} 58
+ http_request_duration_seconds_bucket{code="200",worker_id="0",method="GET",le="0.01"} 58
+ http_request_duration_seconds_bucket{code="200",worker_id="0",method="GET",le="0.05"} 59
+ http_request_duration_seconds_bucket{code="200",worker_id="0",method="GET",le="0.1"} 59
+ http_request_duration_seconds_bucket{code="200",worker_id="0",method="GET",le="1"} 59
+ http_request_duration_seconds_bucket{code="200",worker_id="0",method="GET",le="10"} 59
+ http_request_duration_seconds_bucket{code="200",worker_id="0",method="GET",le="+Inf"} 59
+ http_request_duration_seconds_sum{code="200",worker_id="0",method="GET"} 0.05046115500000003
+ http_request_duration_seconds_count{code="200",worker_id="0",method="GET"} 59
+ http_request_duration_seconds_bucket{code="200",worker_id="1",method="GET",le="0.005"} 70
+ http_request_duration_seconds_bucket{code="200",worker_id="1",method="GET",le="0.01"} 70
+ http_request_duration_seconds_bucket{code="200",worker_id="1",method="GET",le="0.05"} 71
+ http_request_duration_seconds_bucket{code="200",worker_id="1",method="GET",le="0.1"} 71
+ http_request_duration_seconds_bucket{code="200",worker_id="1",method="GET",le="1"} 71
+ http_request_duration_seconds_bucket{code="200",worker_id="1",method="GET",le="10"} 71
+ http_request_duration_seconds_bucket{code="200",worker_id="1",method="GET",le="+Inf"} 71
+ http_request_duration_seconds_sum{code="200",worker_id="1",method="GET"} 0.05646315600000003
+ http_request_duration_seconds_count{code="200",worker_id="1",method="GET"} 71
+ ]
+
+   describe 'add_metrics' do
+     context '1st_metrics' do
+       it 'adds all fields' do
+         all_metrics = Fluent::Plugin::PromMetricsAggregator.new
+         all_metrics.add_metrics(metrics_worker_1)
+         result_str = all_metrics.get_metrics
+
+         expect(result_str).to eq(metrics_worker_1)
+       end
+     end
+     context '2nd_metrics' do
+       it 'append new metrics' do
+         all_metrics = Fluent::Plugin::PromMetricsAggregator.new
+         all_metrics.add_metrics(metrics_worker_1)
+         all_metrics.add_metrics(metrics_worker_2)
+         result_str = all_metrics.get_metrics
+
+         expect(result_str).to eq(metrics_worker_1 + metrics_worker_2)
+       end
+     end
+
+     context '3rd_metrics' do
+       it 'append existing metrics in the right place' do
+         all_metrics = Fluent::Plugin::PromMetricsAggregator.new
+         all_metrics.add_metrics(metrics_worker_1)
+         all_metrics.add_metrics(metrics_worker_2)
+         all_metrics.add_metrics(metrics_worker_3)
+         result_str = all_metrics.get_metrics
+
+         expect(result_str).to eq(metrics_merged_1_and_3 + metrics_worker_2)
+       end
+     end
+   end
+ end
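The spec above expects samples for a metric that is already known to be appended next to the earlier samples of the same metric, while unseen metrics go at the end. A minimal sketch of that merge behavior, assuming a hypothetical `SketchMetricsAggregator` class (this is not the gem's `PromMetricsAggregator`, whose implementation lives in `lib/fluent/plugin/prometheus_metrics.rb`), might look like this:

```ruby
# Hypothetical illustration only; not the gem's PromMetricsAggregator.
# Groups Prometheus text-format lines by the metric named in the preceding
# "# TYPE" / "# HELP" comment, keeping first-seen order.
class SketchMetricsAggregator
  Entry = Struct.new(:comments, :samples)

  def initialize
    @entries = {} # Ruby hashes preserve insertion order
  end

  def add_metrics(text)
    current = nil
    text.each_line(chomp: true) do |raw|
      line = raw.strip
      next if line.empty?

      if (m = line.match(/\A# (?:TYPE|HELP) (\S+)/))
        # A comment line selects (or creates) the entry for that metric.
        current = (@entries[m[1]] ||= Entry.new([], []))
        # Keep each TYPE/HELP line only once per metric.
        current.comments << line unless current.comments.include?(line)
      else
        # Samples with no preceding comment fall back to their own name.
        current ||= (@entries[line[/\A[^{\s]+/]] ||= Entry.new([], []))
        current.samples << line
      end
    end
  end

  def get_metrics
    @entries.values.flat_map { |e| e.comments + e.samples }.join("\n") + "\n"
  end
end
```

Because histogram samples (`_bucket`, `_sum`, `_count`) follow the comments for their base name, they stay grouped under that entry without any suffix handling.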
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: fluent-plugin-prometheus
  version: !ruby/object:Gem::Version
- version: 1.3.0
+ version: 1.7.0
  platform: ruby
  authors:
  - Masahiro Sano
  autorequire:
  bindir: bin
  cert_chain: []
- date: 2019-01-05 00:00:00.000000000 Z
+ date: 2019-10-31 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
  name: fluentd
@@ -34,16 +34,16 @@ dependencies:
  name: prometheus-client
  requirement: !ruby/object:Gem::Requirement
  requirements:
- - - ">="
+ - - "<"
  - !ruby/object:Gem::Version
- version: '0'
+ version: '0.10'
  type: :runtime
  prerelease: false
  version_requirements: !ruby/object:Gem::Requirement
  requirements:
- - - ">="
+ - - "<"
  - !ruby/object:Gem::Version
- version: '0'
+ version: '0.10'
  - !ruby/object:Gem::Dependency
  name: bundler
  requirement: !ruby/object:Gem::Requirement
@@ -122,6 +122,7 @@ files:
  - lib/fluent/plugin/in_prometheus_tail_monitor.rb
  - lib/fluent/plugin/out_prometheus.rb
  - lib/fluent/plugin/prometheus.rb
+ - lib/fluent/plugin/prometheus_metrics.rb
  - misc/fluentd_sample.conf
  - misc/nginx_proxy.conf
  - misc/prometheus.yaml
@@ -129,6 +130,7 @@ files:
  - spec/fluent/plugin/filter_prometheus_spec.rb
  - spec/fluent/plugin/in_prometheus_monitor_spec.rb
  - spec/fluent/plugin/out_prometheus_spec.rb
+ - spec/fluent/plugin/prometheus_metrics_spec.rb
  - spec/fluent/plugin/prometheus_spec.rb
  - spec/fluent/plugin/shared.rb
  - spec/spec_helper.rb
@@ -151,8 +153,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
  - !ruby/object:Gem::Version
  version: '0'
  requirements: []
- rubyforge_project:
- rubygems_version: 2.7.6
+ rubygems_version: 3.0.3
  signing_key:
  specification_version: 4
  summary: A fluent plugin that collects metrics and exposes for Prometheus.
@@ -160,6 +161,7 @@ test_files:
  - spec/fluent/plugin/filter_prometheus_spec.rb
  - spec/fluent/plugin/in_prometheus_monitor_spec.rb
  - spec/fluent/plugin/out_prometheus_spec.rb
+ - spec/fluent/plugin/prometheus_metrics_spec.rb
  - spec/fluent/plugin/prometheus_spec.rb
  - spec/fluent/plugin/shared.rb
  - spec/spec_helper.rb