fluent-plugin-elasticsearch 4.3.3 → 5.0.0
- checksums.yaml +4 -4
- data/.travis.yml +0 -4
- data/History.md +7 -0
- data/README.ElasticsearchInput.md +1 -1
- data/README.Troubleshooting.md +601 -0
- data/README.md +15 -584
- data/fluent-plugin-elasticsearch.gemspec +1 -1
- data/lib/fluent/plugin/out_elasticsearch_data_stream.rb +215 -0
- data/test/plugin/test_out_elasticsearch_data_stream.rb +337 -0
- metadata +6 -3
- data/gemfiles/Gemfile.without.ilm +0 -10
checksums.yaml
CHANGED

```diff
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 5f2f8268d9a8a5acf6d941a915044bd3781a5722cea8bc7ac205d6b7fd6fe580
+  data.tar.gz: 7d9373fb963040efac0ea42f11bf1841f2e0b2a39b5566f5e2794507e77e5f5c
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 0e432748181717cedfa55239d2da7b0141c3280da7240f6f9643db411f1f3168f00f88c6933d65a5d1b944620dab0d87ce088f4dba76fdf17cec336ca55e83bf
+  data.tar.gz: f44b2a14c5a13e1a59bd7d6926af13a9a9ba7b990594bca45de20071756d9ca88f7ee53d6ec00e09f0ad2ed054510a7803ac11c7b695275d37dbacb1e5d51457
```

data/.travis.yml
CHANGED
data/History.md
CHANGED

```diff
@@ -2,6 +2,13 @@
 
 ### [Unreleased]
 
+### 5.0.0
+- Support #retry_operate on data stream (#863)
+- Support placeholder in @data\_stream\_name for @type elasticsearch\_data\_stream (#862)
+- Extract troubleshooting section (#861)
+- Fix unmatched `<source>` close tag (#860)
+- Initial support for Elasticsearch Data Stream (#859)
+
 ### 4.3.3
 - Handle invalid Elasticsearch::Client#info response (#855)
 
```

data/README.Troubleshooting.md
ADDED
@@ -0,0 +1,601 @@

## Index

* [Troubleshooting](#troubleshooting)
  + [Cannot send events to elasticsearch](#cannot-send-events-to-elasticsearch)
  + [Cannot see detailed failure log](#cannot-see-detailed-failure-log)
  + [Cannot connect TLS enabled reverse Proxy](#cannot-connect-tls-enabled-reverse-proxy)
  + [Declined logs are resubmitted forever, why?](#declined-logs-are-resubmitted-forever-why)
  + [Suggested to install typhoeus gem, why?](#suggested-to-install-typhoeus-gem-why)
  + [Stopped to send events on k8s, why?](#stopped-to-send-events-on-k8s-why)
  + [Random 400 - Rejected by Elasticsearch is occured, why?](#random-400---rejected-by-elasticsearch-is-occured-why)
  + [Fluentd seems to hang if it unable to connect Elasticsearch, why?](#fluentd-seems-to-hang-if-it-unable-to-connect-elasticsearch-why)
  + [Enable Index Lifecycle Management](#enable-index-lifecycle-management)
  + [How to specify index codec](#how-to-specify-index-codec)
  + [Cannot push logs to Elasticsearch with connect_write timeout reached, why?](#cannot-push-logs-to-elasticsearch-with-connect_write-timeout-reached-why)

## Troubleshooting

### Cannot send events to Elasticsearch

A common cause of failure is that you are trying to connect to an Elasticsearch instance with an incompatible version.

For example, td-agent currently bundles the 6.x series of the [elasticsearch-ruby](https://github.com/elastic/elasticsearch-ruby) library, which means your Elasticsearch server also needs to be 6.x. You can check the actual version of the client library installed on your system by executing the following command:

```
# For td-agent users
$ /usr/sbin/td-agent-gem list elasticsearch
# For standalone Fluentd users
$ fluent-gem list elasticsearch
```

Alternatively, with fluent-plugin-elasticsearch v2.11.7 or later, you can have the plugin detect version incompatibility with the `validate_client_version` option:

```
validate_client_version true
```

If you get the following error message, please install a compatible elasticsearch client gem:

```
Detected ES 5 but you use ES client 6.1.0.
Please consider to use 5.x series ES client.
```

For further details of the version compatibility issue, please read [the official manual](https://github.com/elastic/elasticsearch-ruby#compatibility).

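If the installed client gem is newer than your cluster, one remedy is to pin a compatible major version of the client gem. This is a sketch; the `~> 5.0` constraint is only an example and must be chosen to match your own cluster version:

```
# For standalone Fluentd users (td-agent users: use td-agent-gem instead)
$ fluent-gem install elasticsearch -v '~> 5.0'
```
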
### Cannot see detailed failure log

A common cause of failure is that you are trying to connect to an Elasticsearch instance with an incompatible SSL protocol version.

For example, `out_elasticsearch` sets `ssl_version` to TLSv1 for historical reasons, while the modern Elasticsearch ecosystem requires TLS v1.2 or later. In this case, `out_elasticsearch` conceals the transporter's failure log by default. If you want to acquire the transporter log, please set the following configuration:

```
with_transporter_log true
@log_level debug
```

Then, the following log is shown in the Fluentd log:

```
2018-10-24 10:00:00 +0900 [error]: #0 [Faraday::ConnectionFailed] SSL_connect returned=1 errno=0 state=SSLv2/v3 read server hello A: unknown protocol (OpenSSL::SSL::SSLError) {:host=>"elasticsearch-host", :port=>80, :scheme=>"https", :user=>"elastic", :password=>"changeme", :protocol=>"https"}
```

This indicates that an inappropriate TLS protocol version is being used.
If you want to use TLS v1.2, specify the `ssl_version` parameter:

```
ssl_version TLSv1_2
```

Alternatively, in v4.0.2 or later combined with Ruby 2.5 or later, the following configuration is also valid:

```
ssl_max_version TLSv1_2
ssl_min_version TLSv1_2
```

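To confirm from the Fluentd host which TLS versions the endpoint actually accepts, the stock `openssl` client can help. A hedged example; replace the host and port with your own:

```
# The handshake succeeds only if the server accepts TLS v1.2
$ openssl s_client -connect elasticsearch-host:9200 -tls1_2 < /dev/null
```
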
### Cannot connect TLS enabled reverse Proxy

A common cause of failure is that you are trying to connect to an Elasticsearch instance behind an nginx reverse proxy which uses an incompatible SSL protocol version.

For example, `out_elasticsearch` sets `ssl_version` to TLSv1 for historical reasons, while nowadays nginx reverse proxies use TLS v1.2 or later for security reasons. In this case, too, `out_elasticsearch` conceals the transporter's failure log by default.

Suppose you set up an nginx reverse proxy with TLS v1.2:

```
server {
    listen <your IP address>:9400;
    server_name <ES-Host>;
    ssl on;
    ssl_certificate /etc/ssl/certs/server-bundle.pem;
    ssl_certificate_key /etc/ssl/private/server-key.pem;
    ssl_client_certificate /etc/ssl/certs/ca.pem;
    ssl_verify_client on;
    ssl_verify_depth 2;

    # Reference : https://cipherli.st/
    ssl_protocols TLSv1.2;
    ssl_prefer_server_ciphers on;
    ssl_ciphers "EECDH+AESGCM:EDH+AESGCM:AES256+EECDH:AES256+EDH";
    ssl_ecdh_curve secp384r1; # Requires nginx >= 1.1.0
    ssl_session_cache shared:SSL:10m;
    ssl_session_tickets off; # Requires nginx >= 1.5.9
    ssl_stapling on; # Requires nginx >= 1.3.7
    ssl_stapling_verify on; # Requires nginx => 1.3.7
    resolver 127.0.0.1 valid=300s;
    resolver_timeout 5s;
    add_header Strict-Transport-Security "max-age=63072000; includeSubDomains; preload";
    add_header X-Frame-Options DENY;
    add_header X-Content-Type-Options nosniff;

    client_max_body_size 64M;
    keepalive_timeout 5;

    location / {
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_pass http://localhost:9200;
    }
}
```

Then, nginx starts the reverse proxy with TLSv1.2, and Fluentd suddenly dies with the following log:

```
Oct 31 9:44:45 <ES-Host> fluentd[6442]: log writing failed. execution expired
Oct 31 9:44:45 <ES-Host> fluentd[6442]: /opt/fluentd/embedded/lib/ruby/gems/2.4.0/gems/excon-0.62.0/lib/excon/ssl_socket.rb:10:in `initialize': stack level too deep (SystemStackError)
Oct 31 9:44:45 <ES-Host> fluentd[6442]: from /opt/fluentd/embedded/lib/ruby/gems/2.4.0/gems/excon-0.62.0/lib/excon/connection.rb:429:in `new'
Oct 31 9:44:45 <ES-Host> fluentd[6442]: from /opt/fluentd/embedded/lib/ruby/gems/2.4.0/gems/excon-0.62.0/lib/excon/connection.rb:429:in `socket'
Oct 31 9:44:45 <ES-Host> fluentd[6442]: from /opt/fluentd/embedded/lib/ruby/gems/2.4.0/gems/excon-0.62.0/lib/excon/connection.rb:111:in `request_call'
Oct 31 9:44:45 <ES-Host> fluentd[6442]: from /opt/fluentd/embedded/lib/ruby/gems/2.4.0/gems/excon-0.62.0/lib/excon/middlewares/mock.rb:48:in `request_call'
Oct 31 9:44:45 <ES-Host> fluentd[6442]: from /opt/fluentd/embedded/lib/ruby/gems/2.4.0/gems/excon-0.62.0/lib/excon/middlewares/instrumentor.rb:26:in `request_call'
Oct 31 9:44:45 <ES-Host> fluentd[6442]: from /opt/fluentd/embedded/lib/ruby/gems/2.4.0/gems/excon-0.62.0/lib/excon/middlewares/base.rb:16:in `request_call'
Oct 31 9:44:45 <ES-Host> fluentd[6442]: from /opt/fluentd/embedded/lib/ruby/gems/2.4.0/gems/excon-0.62.0/lib/excon/middlewares/base.rb:16:in `request_call'
Oct 31 9:44:45 <ES-Host> fluentd[6442]: from /opt/fluentd/embedded/lib/ruby/gems/2.4.0/gems/excon-0.62.0/lib/excon/middlewares/base.rb:16:in `request_call'
Oct 31 9:44:45 <ES-Host> fluentd[6442]: ... 9266 levels...
Oct 31 9:44:45 <ES-Host> fluentd[6442]: from /opt/td-agent/embedded/lib/ruby/site_ruby/2.4.0/rubygems/core_ext/kernel_require.rb:55:in `require'
Oct 31 9:44:45 <ES-Host> fluentd[6442]: from /opt/fluentd/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.2.5/bin/fluentd:8:in `<top (required)>'
Oct 31 9:44:45 <ES-Host> fluentd[6442]: from /opt/fluentd/embedded/bin/fluentd:22:in `load'
Oct 31 9:44:45 <ES-Host> fluentd[6442]: from /opt/fluentd/embedded/bin/fluentd:22:in `<main>'
Oct 31 9:44:45 <ES-Host> systemd[1]: fluentd.service: Control process exited, code=exited status=1
```

If you want to acquire the transporter log, please set the following configuration:

```
with_transporter_log true
@log_level debug
```

Then, the following log is shown in the Fluentd log:

```
2018-10-31 10:00:57 +0900 [warn]: #7 [Faraday::ConnectionFailed] Attempt 2 connecting to {:host=>"<ES-Host>", :port=>9400, :scheme=>"https", :protocol=>"https"}
2018-10-31 10:00:57 +0900 [error]: #7 [Faraday::ConnectionFailed] Connection reset by peer - SSL_connect (Errno::ECONNRESET) {:host=>"<ES-Host>", :port=>9400, :scheme=>"https", :protocol=>"https"}
```

The above logs indicate that the root cause of this issue is an incompatible SSL/TLS version between fluent-plugin-elasticsearch and nginx, the reverse proxy.

If you want to use TLS v1.2, specify the `ssl_version` parameter:

```
ssl_version TLSv1_2
```

Alternatively, in v4.0.2 or later combined with Ruby 2.5 or later, the following configuration is also valid:

```
ssl_max_version TLSv1_2
ssl_min_version TLSv1_2
```

### Declined logs are resubmitted forever, why?

Sometimes users write Fluentd configuration like this:

```aconf
<match **>
  @type elasticsearch
  host localhost
  port 9200
  type_name fluentd
  logstash_format true
  time_key @timestamp
  include_timestamp true
  reconnect_on_error true
  reload_on_failure true
  reload_connections false
  request_timeout 120s
</match>
```

The above configuration does not use the [`@label` feature](https://docs.fluentd.org/v1.0/articles/config-file#(5)-group-filter-and-output:-the-%E2%80%9Clabel%E2%80%9D-directive) and matches with the glob (`**`) pattern.
This is usually a problematic configuration.

In an error scenario, error events are emitted with the `@ERROR` label and a `fluent.*` tag.
The black-hole glob pattern then resubmits the problematic event into the Elasticsearch pushing pipeline.

This situation causes a flood of declined logs:

```log
2018-11-13 11:16:27 +0000 [warn]: #0 dump an error event: error_class=Fluent::Plugin::ElasticsearchErrorHandler::ElasticsearchError error="400 - Rejected by Elasticsearch" location=nil tag="app.fluentcat" time=2018-11-13 11:16:17.492985640 +0000 record={"message"=>"\xFF\xAD"}
2018-11-13 11:16:38 +0000 [warn]: #0 dump an error event: error_class=Fluent::Plugin::ElasticsearchErrorHandler::ElasticsearchError error="400 - Rejected by Elasticsearch" location=nil tag="fluent.warn" time=2018-11-13 11:16:27.978851140 +0000 record={"error"=>"#<Fluent::Plugin::ElasticsearchErrorHandler::ElasticsearchError: 400 - Rejected by Elasticsearch>", "location"=>nil, "tag"=>"app.fluentcat", "time"=>2018-11-13 11:16:17.492985640 +0000, "record"=>{"message"=>"\xFF\xAD"}, "message"=>"dump an error event: error_class=Fluent::Plugin::ElasticsearchErrorHandler::ElasticsearchError error=\"400 - Rejected by Elasticsearch\" location=nil tag=\"app.fluentcat\" time=2018-11-13 11:16:17.492985640 +0000 record={\"message\"=>\"\\xFF\\xAD\"}"}
```

To avoid this, use a more concrete tag route or use `@label`.
The following sections show two examples of how to solve the flood of declined logs:
one uses concrete tag routing, the other uses label routing.

#### Using concrete tag routing

The following configuration uses a concrete tag route:

```aconf
<match out.elasticsearch.**>
  @type elasticsearch
  host localhost
  port 9200
  type_name fluentd
  logstash_format true
  time_key @timestamp
  include_timestamp true
  reconnect_on_error true
  reload_on_failure true
  reload_connections false
  request_timeout 120s
</match>
```

#### Using label feature

The following configuration uses labels:

```aconf
<source>
  @type forward
  @label @ES
</source>
<label @ES>
  <match out.elasticsearch.**>
    @type elasticsearch
    host localhost
    port 9200
    type_name fluentd
    logstash_format true
    time_key @timestamp
    include_timestamp true
    reconnect_on_error true
    reload_on_failure true
    reload_connections false
    request_timeout 120s
  </match>
</label>
<label @ERROR>
  <match **>
    @type stdout
  </match>
</label>
```

### Suggested to install typhoeus gem, why?

fluent-plugin-elasticsearch does not depend on the typhoeus gem by default.
If you want to use the typhoeus backend, you must install the typhoeus gem on your own.

If you use vanilla Fluentd, you can install it with:

```
gem install typhoeus
```

But if you use td-agent instead of vanilla Fluentd, you have to use `td-agent-gem`:

```
td-agent-gem install typhoeus
```

For more detail, please refer to [the official plugin management document](https://docs.fluentd.org/v1.0/articles/plugin-management).

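Once typhoeus is installed, the backend can be selected with the plugin's `http_backend` parameter. A minimal sketch; the host/port values are placeholders and all other connection settings are omitted:

```aconf
<match **>
  @type elasticsearch
  host localhost
  port 9200
  http_backend typhoeus
</match>
```
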
### Stopped to send events on k8s, why?

fluent-plugin-elasticsearch reloads its connections after 10000 requests. (This does not correspond to event counts, because the plugin uses the bulk API.)

This functionality, which originates from the elasticsearch-ruby gem, is enabled by default.

Sometimes this reloading functionality prevents users from sending events with the plugin.

On a k8s platform, users should often specify the following settings:

```aconf
reload_connections false
reconnect_on_error true
reload_on_failure true
```

If you use [fluentd-kubernetes-daemonset](https://github.com/fluent/fluentd-kubernetes-daemonset), you can specify them with environment variables:

* `FLUENT_ELASTICSEARCH_RELOAD_CONNECTIONS` as `false`
* `FLUENT_ELASTICSEARCH_RECONNECT_ON_ERROR` as `true`
* `FLUENT_ELASTICSEARCH_RELOAD_ON_FAILURE` as `true`

This issue was reported at [#525](https://github.com/uken/fluent-plugin-elasticsearch/issues/525).

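In a daemonset manifest, those environment variables would be set on the container spec, roughly like this (a sketch; the container name and image are placeholders):

```yaml
containers:
  - name: fluentd
    image: fluent/fluentd-kubernetes-daemonset
    env:
      - name: FLUENT_ELASTICSEARCH_RELOAD_CONNECTIONS
        value: "false"
      - name: FLUENT_ELASTICSEARCH_RECONNECT_ON_ERROR
        value: "true"
      - name: FLUENT_ELASTICSEARCH_RELOAD_ON_FAILURE
        value: "true"
```
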
### Random 400 - Rejected by Elasticsearch is occured, why?

Index templates installed in Elasticsearch sometimes generate `400 - Rejected by Elasticsearch` errors.
For example, the Kubernetes audit log has this structure:

```json
"responseObject":{
   "kind":"SubjectAccessReview",
   "apiVersion":"authorization.k8s.io/v1beta1",
   "metadata":{
      "creationTimestamp":null
   },
   "spec":{
      "nonResourceAttributes":{
         "path":"/",
         "verb":"get"
      },
      "user":"system:anonymous",
      "group":[
         "system:unauthenticated"
      ]
   },
   "status":{
      "allowed":true,
      "reason":"RBAC: allowed by ClusterRoleBinding \"cluster-system-anonymous\" of ClusterRole \"cluster-admin\" to User \"system:anonymous\""
   }
},
```

The last element, `status`, sometimes becomes `"status":"Success"` instead.
This element type glitch causes the status 400 error.

There are some solutions for fixing this:

#### Solution 1

For a key which causes an element type glitch.

Use dynamic mapping with the following template:

```json
{
  "template": "YOURINDEXNAME-*",
  "mappings": {
    "fluentd": {
      "dynamic_templates": [
        {
          "default_no_index": {
            "path_match": "^.*$",
            "path_unmatch": "^(@timestamp|auditID|level|stage|requestURI|sourceIPs|metadata|objectRef|user|verb)(\\..+)?$",
            "match_pattern": "regex",
            "mapping": {
              "index": false,
              "enabled": false
            }
          }
        }
      ]
    }
  }
}
```

Note that `YOURINDEXNAME` should be replaced with the index prefix you are using.

#### Solution 2

For the case where the `responseObject` and `requestObject` keys do not always exist:

```aconf
<filter YOURROUTETAG>
  @id kube_api_audit_normalize
  @type record_transformer
  auto_typecast false
  enable_ruby true
  <record>
    host "#{ENV['K8S_NODE_NAME']}"
    responseObject ${record["responseObject"].nil? ? "none": record["responseObject"].to_json}
    requestObject ${record["requestObject"].nil? ? "none": record["requestObject"].to_json}
    origin kubernetes-api-audit
  </record>
</filter>
```

Normalizing the `responseObject` and `requestObject` keys with record_transformer or a similar plugin is needed.

### Fluentd seems to hang if it unable to connect Elasticsearch, why?

In the `#configure` phase, the plugin waits until communication with the Elasticsearch instance succeeds, and by default it blocks the launch of Fluentd, because Fluentd requires the configuration to be set up correctly during the `#configure` phase.

After the `#configure` phase, it runs very fast and can send events heavily in some heavy-use cases.

In this scenario, the configuration needs to be validated by the end of the `#configure` phase, so the default parameters are too conservative for advanced users.

To remove this overly pessimistic behavior, you can use the following configuration:

```aconf
<match **>
  @type elasticsearch
  # Some advanced users know which ES version they are using.
  # We can disable startup ES version checking.
  verify_es_version_at_startup false
  # If you know that your ES major version is 7, you can set 7 here.
  default_elasticsearch_version 7
  # If using a very stable ES cluster, you can reduce the retry operation counts. (minimum is 1)
  max_retry_get_es_version 1
  # If using a very stable ES cluster, you can reduce the retry operation counts. (minimum is 1)
  max_retry_putting_template 1
  # ... and some ES plugin configuration
</match>
```

### Enable Index Lifecycle Management

Index Lifecycle Management (ILM) is a template-based index management feature.

The main ILM feature parameters are:

* `index_name` (when `logstash_format` is `false`)
* `logstash_prefix` (when `logstash_format` is `true`)
* `enable_ilm`
* `ilm_policy_id`
* `ilm_policy`

* Advanced usage parameters
  * `application_name`
  * `index_separator`

Not all of them are mandatory, but they are the parameters that take effect for the ILM feature.

The ILM target index alias is created with `index_name` or an index which is calculated from `logstash_prefix`.

From Elasticsearch plugin v4.0.0, the ILM target index is calculated from `index_name` (normal mode) or `logstash_prefix` (when used with `logstash_format` as `true`).

**NOTE:** Before Elasticsearch plugin v4.1.0, using the `deflector_alias` parameter with ILM enabled was permitted and handled, but in later releases (4.1.1 or later) it cannot be used when ILM is enabled.

Also, ILM users should specify their own Elasticsearch template for ILM-enabled indices, because the ILM settings are injected into that template.

`application_name` and `index_separator` also affect the alias index names, but these parameters are intended for advanced usage and should usually be left at their default values (`application_name` defaults to `default`).

The ILM parameters are then used in the alias index name as follows:

##### Simple `index_name` case:

`<index_name><index_separator><application_name>-000001`.

##### `logstash_format` as `true` case:

`<logstash_prefix><logstash_prefix_separator><application_name><logstash_prefix_separator><logstash_dateformat>-000001`.

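As a concrete illustration of the simple case, assuming the default `application_name` (`default`), a `-` separator, and `index_date_pattern ""` so that no date math is attached, `index_name fluentd` would produce names like:

```
fluentd-default         # the write alias
fluentd-default-000001  # the first backing index behind the alias
```
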
#### Example ILM settings

```aconf
index_name fluentd-${tag}
application_name ${tag}
index_date_pattern "now/d"
enable_ilm true
# Policy configurations
ilm_policy_id fluentd-policy
# ilm_policy {} # Use default policy
template_name your-fluentd-template
template_file /path/to/fluentd-template.json
# customize_template {"<<index_prefix>>": "fluentd"}
```

Note: This plugin only creates rollover-enabled indices, the aliases pointing to them, and the index templates, and it creates an ILM policy if enabled.

#### Create ILM indices in each day

If you want to create a new index each day, you should use a `logstash_format`-style configuration:

```aconf
logstash_prefix fluentd
application_name default
index_date_pattern "now/d"
enable_ilm true
# Policy configurations
ilm_policy_id fluentd-policy
# ilm_policy {} # Use default policy
template_name your-fluentd-template
template_file /path/to/fluentd-template.json
```

Note that if you create a new set of indices every day, the Elasticsearch ILM policy system will treat each day separately and will always maintain a separate active write index for each day.

If you have a rollover based on `max_age`, it will continue to roll the indices for prior dates even if no new documents are indexed. If you want to delete indices after a period of time, the ILM policy will never delete the current write index regardless of its age, so you would need a separate system, such as Curator, to actually delete the old indices.

For this reason, if you put the date into the index names with ILM, you should only roll over based on size or number of documents, and you may need to use Curator to actually delete old indices.

#### Fixed ILM indices

Users can also use a fixed ILM indices configuration.
If `index_date_pattern` is set to `""` (empty string), the Elasticsearch plugin won't attach a date pattern to the ILM indices:

```aconf
index_name fluentd
application_name default
index_date_pattern ""
enable_ilm true
# Policy configurations
ilm_policy_id fluentd-policy
# ilm_policy {} # Use default policy
template_name your-fluentd-template
template_file /path/to/fluentd-template.json
```

### How to specify index codec

Elasticsearch can handle compression methods for stored data, such as LZ4 and best_compression.
fluent-plugin-elasticsearch does not provide an API to specify the compression method.

However, users can specify the stored-data compression method with a template.

Create `compression.json` as follows:

```json
{
  "order": 100,
  "index_patterns": [
    "YOUR-INDEX-PATTERN"
  ],
  "settings": {
    "index": {
      "codec": "best_compression"
    }
  }
}
```

Then, specify the above template in your configuration:

```aconf
template_name best_compression_tmpl
template_file compression.json
```

Elasticsearch will store data with `best_compression`:

```
% curl -XGET 'http://localhost:9200/logstash-2019.12.06/_settings?pretty'
```

```json
{
  "logstash-2019.12.06" : {
    "settings" : {
      "index" : {
        "codec" : "best_compression",
        "number_of_shards" : "1",
        "provided_name" : "logstash-2019.12.06",
        "creation_date" : "1575622843800",
        "number_of_replicas" : "1",
        "uuid" : "THE_AWESOMEUUID",
        "version" : {
          "created" : "7040100"
        }
      }
    }
  }
}
```

### Cannot push logs to Elasticsearch with connect_write timeout reached, why?

It seems that the Elasticsearch cluster is exhausted.

Usually, Fluentd complains with logs like the following:

```log
2019-12-29 00:23:33 +0000 [warn]: buffer flush took longer time than slow_flush_log_threshold: elapsed_time=27.283766102716327 slow_flush_log_threshold=15.0 plugin_id="object:aaaffaaaaaff"
2019-12-29 00:23:33 +0000 [warn]: buffer flush took longer time than slow_flush_log_threshold: elapsed_time=26.161768959928304 slow_flush_log_threshold=15.0 plugin_id="object:aaaffaaaaaff"
2019-12-29 00:23:33 +0000 [warn]: buffer flush took longer time than slow_flush_log_threshold: elapsed_time=28.713624476008117 slow_flush_log_threshold=15.0 plugin_id="object:aaaffaaaaaff"
2019-12-29 01:39:18 +0000 [warn]: Could not push logs to Elasticsearch, resetting connection and trying again. connect_write timeout reached
2019-12-29 01:39:18 +0000 [warn]: Could not push logs to Elasticsearch, resetting connection and trying again. connect_write timeout reached
```

These warnings are usually caused by an Elasticsearch cluster exhausted by resource shortage.

If CPU usage spikes and the Elasticsearch cluster is eating up CPU resources, the issue is caused by a CPU resource shortage.

Check your Elasticsearch cluster health status and resource usage.

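The standard cluster APIs are a good starting point for that check, for example (the host and port are placeholders):

```
% curl -XGET 'http://localhost:9200/_cluster/health?pretty'
% curl -XGET 'http://localhost:9200/_nodes/stats/os,process?pretty'
```
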