fluent-plugin-elasticsearch 4.3.3 → 5.0.4
- checksums.yaml +4 -4
- data/.github/workflows/linux.yml +1 -1
- data/.github/workflows/macos.yml +1 -1
- data/.github/workflows/windows.yml +1 -1
- data/.travis.yml +0 -4
- data/History.md +24 -0
- data/README.ElasticsearchInput.md +1 -1
- data/README.Troubleshooting.md +692 -0
- data/README.md +115 -586
- data/fluent-plugin-elasticsearch.gemspec +2 -1
- data/lib/fluent/plugin/elasticsearch_error_handler.rb +2 -1
- data/lib/fluent/plugin/out_elasticsearch.rb +56 -3
- data/lib/fluent/plugin/out_elasticsearch_data_stream.rb +218 -0
- data/test/plugin/test_elasticsearch_error_handler.rb +6 -1
- data/test/plugin/test_out_elasticsearch.rb +296 -1
- data/test/plugin/test_out_elasticsearch_data_stream.rb +337 -0
- metadata +21 -4
- data/gemfiles/Gemfile.without.ilm +0 -10
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 7993521144deb5ebe2665d231cdeda9833cb4b3ba87909a2e21b96f8a43f61cc
+  data.tar.gz: 25e0341fe8d2b131350fb521c520fa31314cdcf478914aaad2d4a3bdbf5a3954
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: db5b6b3f9f4fc1e1d6a9c438e371aa73f962798f8a39519a01bfdeb38bfcb7a8b8e6b4efe20202108d518c33dd0701c29c20c7d4b7c0bb01b16f5907f34dea5d
+  data.tar.gz: 8c69a02d9cda795457f104177773939eb53e652a74f4ebeab4d8dc2b6f943222f9ca967eeca93b0ca283f8be4fa5d381b1300f762d6b5d7e77a07bebb4194524
data/.github/workflows/linux.yml
CHANGED
data/.github/workflows/macos.yml
CHANGED
data/.travis.yml
CHANGED
data/History.md
CHANGED
@@ -2,6 +2,30 @@

### [Unreleased]

### 5.0.4
- test: out_elasticsearch: Remove a needless headers from affinity stub (#888)
- Target Index Affinity (#883)

### 5.0.3
- Fix use_legacy_template documentation (#880)
- Add FAQ for dynamic index/template (#878)
- Handle IPv6 address string on host and hosts parameters (#877)

### 5.0.2
- GitHub Actions: Tweak Ruby versions on test (#875)
- test: datastreams: Set nonexistent datastream as default (#874)
- Fix overwriting of index template and index lifecycle policy on existing data streams (#872)

### 5.0.1
- Use elasticsearch/api instead of elasticsearch/xpack (#870)

### 5.0.0
- Support #retry_operate on data stream (#863)
- Support placeholder in @data_stream_name for @type elasticsearch_data_stream (#862)
- Extract troubleshooting section (#861)
- Fix unmatched `<source>` close tag (#860)
- Initial support for Elasticsearch Data Stream (#859)

### 4.3.3
- Handle invalid Elasticsearch::Client#info response (#855)
data/README.Troubleshooting.md
ADDED
@@ -0,0 +1,692 @@
## Index

* [Troubleshooting](#troubleshooting)
  + [Cannot send events to Elasticsearch](#cannot-send-events-to-elasticsearch)
  + [Cannot see detailed failure log](#cannot-see-detailed-failure-log)
  + [Cannot connect TLS enabled reverse Proxy](#cannot-connect-tls-enabled-reverse-proxy)
  + [Declined logs are resubmitted forever, why?](#declined-logs-are-resubmitted-forever-why)
  + [Suggested to install typhoeus gem, why?](#suggested-to-install-typhoeus-gem-why)
  + [Stopped sending events on k8s, why?](#stopped-sending-events-on-k8s-why)
  + [Random 400 - Rejected by Elasticsearch occurs, why?](#random-400---rejected-by-elasticsearch-occurs-why)
  + [Fluentd seems to hang if it cannot connect to Elasticsearch, why?](#fluentd-seems-to-hang-if-it-cannot-connect-to-elasticsearch-why)
  + [Enable Index Lifecycle Management](#enable-index-lifecycle-management)
  + [Configuring for dynamic index or template](#configuring-for-dynamic-index-or-template)
  + [How to specify index codec](#how-to-specify-index-codec)
  + [Cannot push logs to Elasticsearch with connect_write timeout reached, why?](#cannot-push-logs-to-elasticsearch-with-connect_write-timeout-reached-why)

## Troubleshooting

### Cannot send events to Elasticsearch

A common cause of failure is that you are trying to connect to an Elasticsearch instance with an incompatible version.

For example, td-agent currently bundles the 6.x series of the [elasticsearch-ruby](https://github.com/elastic/elasticsearch-ruby) library. This means that your Elasticsearch server also needs to be 6.x. You can check the actual version of the client library installed on your system by executing the following command:

```
# For td-agent users
$ /usr/sbin/td-agent-gem list elasticsearch
# For standalone Fluentd users
$ fluent-gem list elasticsearch
```

Alternatively, with fluent-plugin-elasticsearch v2.11.7 or later, you can detect version incompatibility with the `validate_client_version` option:

```
validate_client_version true
```

If you get the following error message, please consider installing compatible elasticsearch client gems:

```
Detected ES 5 but you use ES client 6.1.0.
Please consider to use 5.x series ES client.
```

For further details of the version compatibility issue, please read [the official manual](https://github.com/elastic/elasticsearch-ruby#compatibility).
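If the installed client gem series does not match your cluster, one option is to install a matching series explicitly. A minimal sketch, assuming a 7.x cluster (the version requirement below is only an example, not a recommendation):

```
# For td-agent users
$ /usr/sbin/td-agent-gem install elasticsearch -v "~> 7.0"
# For standalone Fluentd users
$ fluent-gem install elasticsearch -v "~> 7.0"
```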
### Cannot see detailed failure log

A common cause of failure is that you are trying to connect to an Elasticsearch instance with an incompatible SSL protocol version.

For example, `out_elasticsearch` sets `ssl_version` to TLSv1 for historical reasons, while the modern Elasticsearch ecosystem requires TLS v1.2 or later.
However, in this case, `out_elasticsearch` conceals the transporter-level failure log by default.
If you want to acquire the transporter log, please consider setting the following configuration:

```
with_transporter_log true
@log_level debug
```

Then, the following log is shown in the Fluentd log:

```
2018-10-24 10:00:00 +0900 [error]: #0 [Faraday::ConnectionFailed] SSL_connect returned=1 errno=0 state=SSLv2/v3 read server hello A: unknown protocol (OpenSSL::SSL::SSLError) {:host=>"elasticsearch-host", :port=>80, :scheme=>"https", :user=>"elastic", :password=>"changeme", :protocol=>"https"}
```

This indicates that an inappropriate TLS protocol version is being used.
If you want to use TLS v1.2, please use the `ssl_version` parameter:

```
ssl_version TLSv1_2
```

or, with v4.0.2 or later combined with Ruby 2.5 or later, the following configuration is also valid:

```
ssl_max_version TLSv1_2
ssl_min_version TLSv1_2
```
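Putting these pieces together, a minimal sketch of an output block that forces TLS v1.2 and surfaces the transporter log (the host and port values here are placeholders):

```aconf
<match **>
  @type elasticsearch
  host elasticsearch-host
  port 9200
  scheme https
  # Force a modern TLS version instead of the historical TLSv1 default
  ssl_version TLSv1_2
  # Surface transporter-level failures in the Fluentd log
  with_transporter_log true
  @log_level debug
</match>
```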
### Cannot connect TLS enabled reverse Proxy

A common cause of failure is that you are trying to connect to an Elasticsearch instance behind an nginx reverse proxy which uses an incompatible SSL protocol version.

For example, `out_elasticsearch` sets `ssl_version` to TLSv1 for historical reasons, while nowadays an nginx reverse proxy uses TLS v1.2 or later for security reasons.
However, in this case, `out_elasticsearch` conceals the transporter-level failure log by default.

If you set up an nginx reverse proxy with TLS v1.2:

```
server {
    listen <your IP address>:9400;
    server_name <ES-Host>;
    ssl on;
    ssl_certificate /etc/ssl/certs/server-bundle.pem;
    ssl_certificate_key /etc/ssl/private/server-key.pem;
    ssl_client_certificate /etc/ssl/certs/ca.pem;
    ssl_verify_client   on;
    ssl_verify_depth    2;

    # Reference : https://cipherli.st/
    ssl_protocols TLSv1.2;
    ssl_prefer_server_ciphers on;
    ssl_ciphers "EECDH+AESGCM:EDH+AESGCM:AES256+EECDH:AES256+EDH";
    ssl_ecdh_curve secp384r1; # Requires nginx >= 1.1.0
    ssl_session_cache shared:SSL:10m;
    ssl_session_tickets off; # Requires nginx >= 1.5.9
    ssl_stapling on; # Requires nginx >= 1.3.7
    ssl_stapling_verify on; # Requires nginx => 1.3.7
    resolver 127.0.0.1 valid=300s;
    resolver_timeout 5s;
    add_header Strict-Transport-Security "max-age=63072000; includeSubDomains; preload";
    add_header X-Frame-Options DENY;
    add_header X-Content-Type-Options nosniff;

    client_max_body_size 64M;
    keepalive_timeout 5;

    location / {
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_pass http://localhost:9200;
    }
}
```

then the nginx reverse proxy starts with TLSv1.2 and Fluentd suddenly dies with the following log:

```
Oct 31 9:44:45 <ES-Host> fluentd[6442]: log writing failed. execution expired
Oct 31 9:44:45 <ES-Host> fluentd[6442]: /opt/fluentd/embedded/lib/ruby/gems/2.4.0/gems/excon-0.62.0/lib/excon/ssl_socket.rb:10:in `initialize': stack level too deep (SystemStackError)
Oct 31 9:44:45 <ES-Host> fluentd[6442]: from /opt/fluentd/embedded/lib/ruby/gems/2.4.0/gems/excon-0.62.0/lib/excon/connection.rb:429:in `new'
Oct 31 9:44:45 <ES-Host> fluentd[6442]: from /opt/fluentd/embedded/lib/ruby/gems/2.4.0/gems/excon-0.62.0/lib/excon/connection.rb:429:in `socket'
Oct 31 9:44:45 <ES-Host> fluentd[6442]: from /opt/fluentd/embedded/lib/ruby/gems/2.4.0/gems/excon-0.62.0/lib/excon/connection.rb:111:in `request_call'
Oct 31 9:44:45 <ES-Host> fluentd[6442]: from /opt/fluentd/embedded/lib/ruby/gems/2.4.0/gems/excon-0.62.0/lib/excon/middlewares/mock.rb:48:in `request_call'
Oct 31 9:44:45 <ES-Host> fluentd[6442]: from /opt/fluentd/embedded/lib/ruby/gems/2.4.0/gems/excon-0.62.0/lib/excon/middlewares/instrumentor.rb:26:in `request_call'
Oct 31 9:44:45 <ES-Host> fluentd[6442]: from /opt/fluentd/embedded/lib/ruby/gems/2.4.0/gems/excon-0.62.0/lib/excon/middlewares/base.rb:16:in `request_call'
Oct 31 9:44:45 <ES-Host> fluentd[6442]: from /opt/fluentd/embedded/lib/ruby/gems/2.4.0/gems/excon-0.62.0/lib/excon/middlewares/base.rb:16:in `request_call'
Oct 31 9:44:45 <ES-Host> fluentd[6442]: from /opt/fluentd/embedded/lib/ruby/gems/2.4.0/gems/excon-0.62.0/lib/excon/middlewares/base.rb:16:in `request_call'
Oct 31 9:44:45 <ES-Host> fluentd[6442]: ... 9266 levels...
Oct 31 9:44:45 <ES-Host> fluentd[6442]: from /opt/td-agent/embedded/lib/ruby/site_ruby/2.4.0/rubygems/core_ext/kernel_require.rb:55:in `require'
Oct 31 9:44:45 <ES-Host> fluentd[6442]: from /opt/fluentd/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.2.5/bin/fluentd:8:in `<top (required)>'
Oct 31 9:44:45 <ES-Host> fluentd[6442]: from /opt/fluentd/embedded/bin/fluentd:22:in `load'
Oct 31 9:44:45 <ES-Host> fluentd[6442]: from /opt/fluentd/embedded/bin/fluentd:22:in `<main>'
Oct 31 9:44:45 <ES-Host> systemd[1]: fluentd.service: Control process exited, code=exited status=1
```

If you want to acquire the transporter log, please consider setting the following configuration:

```
with_transporter_log true
@log_level debug
```

Then, the following log is shown in the Fluentd log:

```
2018-10-31 10:00:57 +0900 [warn]: #7 [Faraday::ConnectionFailed] Attempt 2 connecting to {:host=>"<ES-Host>", :port=>9400, :scheme=>"https", :protocol=>"https"}
2018-10-31 10:00:57 +0900 [error]: #7 [Faraday::ConnectionFailed] Connection reset by peer - SSL_connect (Errno::ECONNRESET) {:host=>"<ES-Host>", :port=>9400, :scheme=>"https", :protocol=>"https"}
```

The above logs indicate that an incompatible SSL/TLS version between fluent-plugin-elasticsearch and the nginx reverse proxy is the root cause of this issue.

If you want to use TLS v1.2, please use the `ssl_version` parameter:

```
ssl_version TLSv1_2
```

or, with v4.0.2 or later combined with Ruby 2.5 or later, the following configuration is also valid:

```
ssl_max_version TLSv1_2
ssl_min_version TLSv1_2
```
### Declined logs are resubmitted forever, why?

Sometimes users write Fluentd configuration like this:

```aconf
<match **>
  @type elasticsearch
  host localhost
  port 9200
  type_name fluentd
  logstash_format true
  time_key @timestamp
  include_timestamp true
  reconnect_on_error true
  reload_on_failure true
  reload_connections false
  request_timeout 120s
</match>
```

The above configuration does not use the [`@label` feature](https://docs.fluentd.org/v1.0/articles/config-file#(5)-group-filter-and-output:-the-%E2%80%9Clabel%E2%80%9D-directive) and uses a glob (`**`) pattern.
This is usually a problematic configuration.

In an error scenario, error events are emitted with the `@ERROR` label and a `fluent.*` tag.
The catch-all glob pattern resubmits the problematic event back into the Elasticsearch output pipeline.

This situation causes a flood of declined logs:

```log
2018-11-13 11:16:27 +0000 [warn]: #0 dump an error event: error_class=Fluent::Plugin::ElasticsearchErrorHandler::ElasticsearchError error="400 - Rejected by Elasticsearch" location=nil tag="app.fluentcat" time=2018-11-13 11:16:17.492985640 +0000 record={"message"=>"\xFF\xAD"}
2018-11-13 11:16:38 +0000 [warn]: #0 dump an error event: error_class=Fluent::Plugin::ElasticsearchErrorHandler::ElasticsearchError error="400 - Rejected by Elasticsearch" location=nil tag="fluent.warn" time=2018-11-13 11:16:27.978851140 +0000 record={"error"=>"#<Fluent::Plugin::ElasticsearchErrorHandler::ElasticsearchError: 400 - Rejected by Elasticsearch>", "location"=>nil, "tag"=>"app.fluentcat", "time"=>2018-11-13 11:16:17.492985640 +0000, "record"=>{"message"=>"\xFF\xAD"}, "message"=>"dump an error event: error_class=Fluent::Plugin::ElasticsearchErrorHandler::ElasticsearchError error=\"400 - Rejected by Elasticsearch\" location=nil tag=\"app.fluentcat\" time=2018-11-13 11:16:17.492985640 +0000 record={\"message\"=>\"\\xFF\\xAD\"}"}
```

To avoid this, use a more concrete tag route or use `@label`.
The following sections show two examples of how to solve the flood of declined logs:
one uses concrete tag routing, the other uses label routing.

#### Using concrete tag routing

The following configuration uses a concrete tag route:

```aconf
<match out.elasticsearch.**>
  @type elasticsearch
  host localhost
  port 9200
  type_name fluentd
  logstash_format true
  time_key @timestamp
  include_timestamp true
  reconnect_on_error true
  reload_on_failure true
  reload_connections false
  request_timeout 120s
</match>
```

#### Using label feature

The following configuration uses labels:

```aconf
<source>
  @type forward
  @label @ES
</source>
<label @ES>
  <match out.elasticsearch.**>
    @type elasticsearch
    host localhost
    port 9200
    type_name fluentd
    logstash_format true
    time_key @timestamp
    include_timestamp true
    reconnect_on_error true
    reload_on_failure true
    reload_connections false
    request_timeout 120s
  </match>
</label>
<label @ERROR>
  <match **>
    @type stdout
  </match>
</label>
```
### Suggested to install typhoeus gem, why?

fluent-plugin-elasticsearch does not depend on the typhoeus gem by default.
If you want to use the typhoeus backend, you must install the typhoeus gem on your own.

If you use vanilla Fluentd, you can install it with:

```
gem install typhoeus
```

But if you use td-agent instead of vanilla Fluentd, you have to use `td-agent-gem`:

```
td-agent-gem install typhoeus
```

For more detail, please refer to [the official plugin management document](https://docs.fluentd.org/v1.0/articles/plugin-management).
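Once the gem is installed, the HTTP backend is selected in the output configuration. A minimal sketch, assuming the `http_backend` parameter described in the main README (its default is `excon`):

```aconf
<match **>
  @type elasticsearch
  host localhost
  port 9200
  # Switch the HTTP client backend from the default (excon) to typhoeus
  http_backend typhoeus
</match>
```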
### Stopped sending events on k8s, why?

fluent-plugin-elasticsearch reloads its connections after 10000 requests. (This does not correspond to the number of events, because the plugin uses the bulk API.)

This functionality, which originates from the elasticsearch-ruby gem, is enabled by default.

Sometimes this reloading functionality prevents users from sending events with the plugin.

On the k8s platform, users sometimes need to specify the following settings:

```aconf
reload_connections false
reconnect_on_error true
reload_on_failure true
```

If you use [fluentd-kubernetes-daemonset](https://github.com/fluent/fluentd-kubernetes-daemonset), you can specify them with environment variables (see the manifest sketch below):

* `FLUENT_ELASTICSEARCH_RELOAD_CONNECTIONS` as `false`
* `FLUENT_ELASTICSEARCH_RECONNECT_ON_ERROR` as `true`
* `FLUENT_ELASTICSEARCH_RELOAD_ON_FAILURE` as `true`
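A minimal sketch of how those variables might be set in the daemonset's container spec (only the variable names come from the list above; the surrounding manifest structure is an assumption):

```yaml
# Hypothetical excerpt of a fluentd-kubernetes-daemonset container spec
env:
  - name: FLUENT_ELASTICSEARCH_RELOAD_CONNECTIONS
    value: "false"
  - name: FLUENT_ELASTICSEARCH_RECONNECT_ON_ERROR
    value: "true"
  - name: FLUENT_ELASTICSEARCH_RELOAD_ON_FAILURE
    value: "true"
```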
This issue was reported in [#525](https://github.com/uken/fluent-plugin-elasticsearch/issues/525).
### Random 400 - Rejected by Elasticsearch occurs, why?

Index templates installed in Elasticsearch sometimes generate 400 - Rejected by Elasticsearch errors.
For example, a kubernetes audit log has this structure:

```json
  "responseObject":{
    "kind":"SubjectAccessReview",
    "apiVersion":"authorization.k8s.io/v1beta1",
    "metadata":{
      "creationTimestamp":null
    },
    "spec":{
      "nonResourceAttributes":{
        "path":"/",
        "verb":"get"
      },
      "user":"system:anonymous",
      "group":[
        "system:unauthenticated"
      ]
    },
    "status":{
      "allowed":true,
      "reason":"RBAC: allowed by ClusterRoleBinding \"cluster-system-anonymous\" of ClusterRole \"cluster-admin\" to User \"system:anonymous\""
    }
  },
```

The last element, `status`, sometimes becomes `"status":"Success"` instead.
This element type glitch causes a status 400 error.
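In other words, once Elasticsearch has mapped `responseObject.status` as an object, a later document in which the same field is a plain string is rejected. A simplified sketch of two such conflicting events (values are illustrative only):

```
# First event: "status" is indexed as an object, fixing the field mapping
{"responseObject": {"status": {"allowed": true}}}
# Later event: "status" arrives as a plain string and no longer matches that mapping
{"responseObject": {"status": "Success"}}
```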
There are some solutions for fixing this:

#### Solution 1

For the case where a specific key causes the element type glitch.

Use dynamic mapping with the following template:

```json
{
  "template": "YOURINDEXNAME-*",
  "mappings": {
    "fluentd": {
      "dynamic_templates": [
        {
          "default_no_index": {
            "path_match": "^.*$",
            "path_unmatch": "^(@timestamp|auditID|level|stage|requestURI|sourceIPs|metadata|objectRef|user|verb)(\\..+)?$",
            "match_pattern": "regex",
            "mapping": {
              "index": false,
              "enabled": false
            }
          }
        }
      ]
    }
  }
}
```

Note that `YOURINDEXNAME` should be replaced with the index prefix you are using.
#### Solution 2

For the case where the `responseObject` and `requestObject` keys do not always exist.

```aconf
<filter YOURROUTETAG>
  @id kube_api_audit_normalize
  @type record_transformer
  auto_typecast false
  enable_ruby true
  <record>
    host "#{ENV['K8S_NODE_NAME']}"
    responseObject ${record["responseObject"].nil? ? "none": record["responseObject"].to_json}
    requestObject ${record["requestObject"].nil? ? "none": record["requestObject"].to_json}
    origin kubernetes-api-audit
  </record>
</filter>
```

Normalizing the `responseObject` and `requestObject` keys with record_transformer or other similar plugins is needed.
### Fluentd seems to hang if it cannot connect to Elasticsearch, why?

In the `#configure` phase, the ES plugin waits until communication with the ES instance succeeds, and by default it blocks Fluentd's startup, because Fluentd expects the configuration to be set up correctly in the `#configure` phase.

After the `#configure` phase, it runs very fast and sends events heavily in high-traffic cases.

In this scenario, the configuration needs to be correct by the end of the `#configure` phase, so the default parameters are too conservative for advanced users.

To remove this overly pessimistic behavior, you can use the following configuration:

```aconf
<match **>
  @type elasticsearch
  # Some advanced users know the ES version they are using.
  # We can disable the startup ES version check.
  verify_es_version_at_startup false
  # If you know that your ES major version is 7, you can set it to 7 here.
  default_elasticsearch_version 7
  # If you use a very stable ES cluster, you can reduce the retry operation counts. (minimum is 1)
  max_retry_get_es_version 1
  # If you use a very stable ES cluster, you can reduce the retry operation counts. (minimum is 1)
  max_retry_putting_template 1
  # ... and some other ES plugin configuration
</match>
```
### Enable Index Lifecycle Management

Index Lifecycle Management (ILM) is a template-based index management feature.

The main ILM parameters are:

* `index_name` (when `logstash_format` is false)
* `logstash_prefix` (when `logstash_format` is true)
* `enable_ilm`
* `ilm_policy_id`
* `ilm_policy`

* Advanced usage parameters
  * `application_name`
  * `index_separator`

Not all of them are mandatory, but they are the parameters that the ILM feature actually uses.

The ILM target index alias is created with `index_name` or with an index name calculated from `logstash_prefix`.

From Elasticsearch plugin v4.0.0, the ILM target index is calculated from `index_name` (normal mode) or `logstash_prefix` (when `logstash_format` is true).

**NOTE:** Before Elasticsearch plugin v4.1.0, using the `deflector_alias` parameter with ILM enabled was permitted and handled, but in later releases (4.1.1 or later) it cannot be used when ILM is enabled.

Also, ILM users should specify their own Elasticsearch template for the ILM-enabled indices, because the ILM settings are injected into that template.

`application_name` and `index_separator` also affect the alias index names, but these parameters are intended for advanced usage and should usually be left at their default values (`default` for `application_name`).

The ILM parameters are then used in the alias index name as follows (a worked example follows the two patterns below):

##### Simple `index_name` case:

`<index_name><index_separator><application_name>-000001`.

##### `logstash_format` as `true` case:

`<logstash_prefix><logstash_prefix_separator><application_name><logstash_prefix_separator><logstash_dateformat>-000001`.
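For instance, as a sketch assuming the defaults (`application_name` of `default` and an `index_separator` of `-`), `index_name fluentd` in the simple case yields an ILM target index named:

```
fluentd-default-000001
```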
#### Example ILM settings

```aconf
index_name fluentd-${tag}
application_name ${tag}
index_date_pattern "now/d"
enable_ilm true
# Policy configurations
ilm_policy_id fluentd-policy
# ilm_policy {} # Use default policy
template_name your-fluentd-template
template_file /path/to/fluentd-template.json
# customize_template {"<<index_prefix>>": "fluentd"}
```

Note: This plugin only creates rollover-enabled indices, the aliases pointing to them, and index templates, and it creates an ILM policy if enabled.
#### Create ILM indices each day

If you want to create a new index each day, you should use a `logstash_format` style configuration:

```aconf
logstash_prefix fluentd
application_name default
index_date_pattern "now/d"
enable_ilm true
# Policy configurations
ilm_policy_id fluentd-policy
# ilm_policy {} # Use default policy
template_name your-fluentd-template
template_file /path/to/fluentd-template.json
```

Note that if you create a new set of indices every day, the Elasticsearch ILM policy system will treat each day separately and will always maintain a separate active write index for each day.

If you have a rollover based on `max_age`, it will continue to roll the indices for prior dates even if no new documents are indexed. The ILM policy will never delete the current write index regardless of its age, so if you want to delete indices after a period of time, you need a separate system, such as curator, to actually delete the old indices.

For this reason, if you put the date into the index names with ILM, you should only roll over based on size or number of documents, and you may need to use curator to actually delete old indices.
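As a sketch of that advice, an `ilm_policy` whose hot phase rolls over only on size and document count (the thresholds below are placeholders, not recommendations) could look like:

```aconf
ilm_policy {"policy": {"phases": {"hot": {"actions": {"rollover": {"max_size": "50gb", "max_docs": 50000000}}}}}}
```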
#### Fixed ILM indices

Users can also use a fixed ILM indices configuration.
If `index_date_pattern` is set to `""` (an empty string), the Elasticsearch plugin won't attach a date pattern to the ILM indices:

```aconf
index_name fluentd
application_name default
index_date_pattern ""
enable_ilm true
# Policy configurations
ilm_policy_id fluentd-policy
# ilm_policy {} # Use default policy
template_name your-fluentd-template
template_file /path/to/fluentd-template.json
```
#### Configuring for dynamic index or template

Some users want to set up ILM for a dynamic index/template.
`index_patterns` and `template.settings.index.lifecycle.name` in the Elasticsearch template will be overwritten by the Elasticsearch plugin:

```json
{
  "index_patterns": ["mock"],
  "template": {
    "settings": {
      "index": {
        "lifecycle": {
          "name": "mock",
          "rollover_alias": "mock"
        },
        "number_of_shards": "<<shard>>",
        "number_of_replicas": "<<replica>>"
      }
    }
  }
}
```

This template will be handled with:

```aconf
<source>
  @type http
  port 5004
  bind 0.0.0.0
  body_size_limit 32m
  keepalive_timeout 10s
  <parse>
    @type json
  </parse>
</source>

<match kubernetes.var.log.containers.**etl-webserver**.log>
  @type elasticsearch
  @id out_es_etl_webserver
  @log_level info
  include_tag_key true
  host $HOST
  port $PORT
  path "#{ENV['FLUENT_ELASTICSEARCH_PATH']}"
  request_timeout "#{ENV['FLUENT_ELASTICSEARCH_REQUEST_TIMEOUT'] || '30s'}"
  scheme "#{ENV['FLUENT_ELASTICSEARCH_SCHEME'] || 'http'}"
  ssl_verify "#{ENV['FLUENT_ELASTICSEARCH_SSL_VERIFY'] || 'true'}"
  ssl_version "#{ENV['FLUENT_ELASTICSEARCH_SSL_VERSION'] || 'TLSv1'}"
  reload_connections "#{ENV['FLUENT_ELASTICSEARCH_RELOAD_CONNECTIONS'] || 'false'}"
  reconnect_on_error "#{ENV['FLUENT_ELASTICSEARCH_RECONNECT_ON_ERROR'] || 'true'}"
  reload_on_failure "#{ENV['FLUENT_ELASTICSEARCH_RELOAD_ON_FAILURE'] || 'true'}"
  log_es_400_reason "#{ENV['FLUENT_ELASTICSEARCH_LOG_ES_400_REASON'] || 'false'}"
  logstash_prefix "#{ENV['FLUENT_ELASTICSEARCH_LOGSTASH_PREFIX'] || 'etl-webserver'}"
  logstash_format "#{ENV['FLUENT_ELASTICSEARCH_LOGSTASH_FORMAT'] || 'false'}"
  index_name "#{ENV['FLUENT_ELASTICSEARCH_LOGSTASH_INDEX_NAME'] || 'etl-webserver'}"
  type_name "#{ENV['FLUENT_ELASTICSEARCH_LOGSTASH_TYPE_NAME'] || 'fluentd'}"
  time_key "#{ENV['FLUENT_ELASTICSEARCH_TIME_KEY'] || '@timestamp'}"
  include_timestamp "#{ENV['FLUENT_ELASTICSEARCH_INCLUDE_TIMESTAMP'] || 'true'}"

  # ILM Settings - WITH ROLLOVER support
  # https://github.com/uken/fluent-plugin-elasticsearch#enable-index-lifecycle-management
  application_name "etl-webserver"
  index_date_pattern ""
  # Policy configurations
  enable_ilm true
  ilm_policy_id etl-webserver
  ilm_policy_overwrite true
  ilm_policy {"policy": {"phases": {"hot": {"min_age": "0ms","actions": {"rollover": {"max_age": "5m","max_size": "3gb"},"set_priority": {"priority": 100}}},"delete": {"min_age": "30d","actions": {"delete": {"delete_searchable_snapshot": true}}}}}}
  use_legacy_template false
  template_name etl-webserver
  template_file /configs/index-template.json
  template_overwrite true
  customize_template {"<<shard>>": "3","<<replica>>": "0"}

  <buffer>
    flush_thread_count "#{ENV['FLUENT_ELASTICSEARCH_BUFFER_FLUSH_THREAD_COUNT'] || '8'}"
    flush_interval "#{ENV['FLUENT_ELASTICSEARCH_BUFFER_FLUSH_INTERVAL'] || '5s'}"
    chunk_limit_size "#{ENV['FLUENT_ELASTICSEARCH_BUFFER_CHUNK_LIMIT_SIZE'] || '8MB'}"
    total_limit_size "#{ENV['FLUENT_ELASTICSEARCH_TOTAL_LIMIT_SIZE'] || '450MB'}"
    queue_limit_length "#{ENV['FLUENT_ELASTICSEARCH_BUFFER_QUEUE_LIMIT_LENGTH'] || '32'}"
    retry_max_interval "#{ENV['FLUENT_ELASTICSEARCH_BUFFER_RETRY_MAX_INTERVAL'] || '60s'}"
    retry_forever false
  </buffer>
</match>
```

For more details, please refer to the discussion at
https://github.com/uken/fluent-plugin-elasticsearch/issues/867.
### How to specify index codec

Elasticsearch can handle compression methods for stored data, such as LZ4 and best_compression.
fluent-plugin-elasticsearch doesn't provide an API to specify the compression method directly.

Instead, users can specify the stored data compression method with a template.

Create `compression.json` as follows:

```json
{
  "order": 100,
  "index_patterns": [
    "YOUR-INDEX-PATTERN"
  ],
  "settings": {
    "index": {
      "codec": "best_compression"
    }
  }
}
```

Then, specify the above template in your configuration:

```aconf
template_name best_compression_tmpl
template_file compression.json
```

Elasticsearch will store data with `best_compression`:

```
% curl -XGET 'http://localhost:9200/logstash-2019.12.06/_settings?pretty'
```

```json
{
  "logstash-2019.12.06" : {
    "settings" : {
      "index" : {
        "codec" : "best_compression",
        "number_of_shards" : "1",
        "provided_name" : "logstash-2019.12.06",
        "creation_date" : "1575622843800",
        "number_of_replicas" : "1",
        "uuid" : "THE_AWESOMEUUID",
        "version" : {
          "created" : "7040100"
        }
      }
    }
  }
}
```
### Cannot push logs to Elasticsearch with connect_write timeout reached, why?

It usually means that the Elasticsearch cluster is exhausted.

Typically, Fluentd complains with logs like the following:

```log
2019-12-29 00:23:33 +0000 [warn]: buffer flush took longer time than slow_flush_log_threshold: elapsed_time=27.283766102716327 slow_flush_log_threshold=15.0 plugin_id="object:aaaffaaaaaff"
2019-12-29 00:23:33 +0000 [warn]: buffer flush took longer time than slow_flush_log_threshold: elapsed_time=26.161768959928304 slow_flush_log_threshold=15.0 plugin_id="object:aaaffaaaaaff"
2019-12-29 00:23:33 +0000 [warn]: buffer flush took longer time than slow_flush_log_threshold: elapsed_time=28.713624476008117 slow_flush_log_threshold=15.0 plugin_id="object:aaaffaaaaaff"
2019-12-29 01:39:18 +0000 [warn]: Could not push logs to Elasticsearch, resetting connection and trying again. connect_write timeout reached
2019-12-29 01:39:18 +0000 [warn]: Could not push logs to Elasticsearch, resetting connection and trying again. connect_write timeout reached
```

These warnings are usually caused by an Elasticsearch cluster exhausted due to resource shortage.

If CPU usage spikes and the Elasticsearch cluster is eating up CPU resources, this issue is caused by a CPU resource shortage.

Check your Elasticsearch cluster health status and resource usage.