fluent-plugin-opensearch 1.0.0

Files changed (52)
  1. checksums.yaml +7 -0
  2. data/.coveralls.yml +1 -0
  3. data/.editorconfig +9 -0
  4. data/.github/ISSUE_TEMPLATE/bug_report.md +37 -0
  5. data/.github/ISSUE_TEMPLATE/feature_request.md +24 -0
  6. data/.github/workflows/coverage.yaml +22 -0
  7. data/.github/workflows/issue-auto-closer.yml +12 -0
  8. data/.github/workflows/linux.yml +26 -0
  9. data/.github/workflows/macos.yml +26 -0
  10. data/.github/workflows/windows.yml +26 -0
  11. data/.gitignore +18 -0
  12. data/CONTRIBUTING.md +24 -0
  13. data/Gemfile +10 -0
  14. data/History.md +6 -0
  15. data/ISSUE_TEMPLATE.md +26 -0
  16. data/LICENSE.txt +201 -0
  17. data/PULL_REQUEST_TEMPLATE.md +9 -0
  18. data/README.OpenSearchGenID.md +116 -0
  19. data/README.OpenSearchInput.md +291 -0
  20. data/README.Troubleshooting.md +482 -0
  21. data/README.md +1556 -0
  22. data/Rakefile +37 -0
  23. data/fluent-plugin-opensearch.gemspec +38 -0
  24. data/gemfiles/Gemfile.elasticsearch.v6 +12 -0
  25. data/lib/fluent/log-ext.rb +64 -0
  26. data/lib/fluent/plugin/filter_opensearch_genid.rb +103 -0
  27. data/lib/fluent/plugin/in_opensearch.rb +351 -0
  28. data/lib/fluent/plugin/oj_serializer.rb +48 -0
  29. data/lib/fluent/plugin/opensearch_constants.rb +39 -0
  30. data/lib/fluent/plugin/opensearch_error.rb +31 -0
  31. data/lib/fluent/plugin/opensearch_error_handler.rb +166 -0
  32. data/lib/fluent/plugin/opensearch_fallback_selector.rb +36 -0
  33. data/lib/fluent/plugin/opensearch_index_template.rb +155 -0
  34. data/lib/fluent/plugin/opensearch_simple_sniffer.rb +36 -0
  35. data/lib/fluent/plugin/opensearch_tls.rb +96 -0
  36. data/lib/fluent/plugin/out_opensearch.rb +1124 -0
  37. data/lib/fluent/plugin/out_opensearch_data_stream.rb +214 -0
  38. data/test/helper.rb +61 -0
  39. data/test/plugin/test_alias_template.json +9 -0
  40. data/test/plugin/test_filter_opensearch_genid.rb +241 -0
  41. data/test/plugin/test_in_opensearch.rb +493 -0
  42. data/test/plugin/test_index_alias_template.json +11 -0
  43. data/test/plugin/test_index_template.json +25 -0
  44. data/test/plugin/test_oj_serializer.rb +45 -0
  45. data/test/plugin/test_opensearch_error_handler.rb +689 -0
  46. data/test/plugin/test_opensearch_fallback_selector.rb +100 -0
  47. data/test/plugin/test_opensearch_tls.rb +171 -0
  48. data/test/plugin/test_out_opensearch.rb +3953 -0
  49. data/test/plugin/test_out_opensearch_data_stream.rb +474 -0
  50. data/test/plugin/test_template.json +23 -0
  51. data/test/test_log-ext.rb +61 -0
  52. metadata +262 -0
@@ -0,0 +1,482 @@
1
+ ## Index
2
+
3
+ * [Troubleshooting](#troubleshooting)
4
+ + [Cannot send events to OpenSearch](#cannot-send-events-to-opensearch)
5
+ + [Cannot see detailed failure log](#cannot-see-detailed-failure-log)
6
+ + [Cannot connect TLS enabled reverse Proxy](#cannot-connect-tls-enabled-reverse-proxy)
7
+ + [Declined logs are resubmitted forever, why?](#declined-logs-are-resubmitted-forever-why)
8
+ + [Suggested to install typhoeus gem, why?](#suggested-to-install-typhoeus-gem-why)
9
+ + [Random 400 - Rejected by OpenSearch is occured, why?](#random-400---rejected-by-opensearch-is-occured-why)
10
+ + [Fluentd seems to hang if it unable to connect OpenSearch, why?](#fluentd-seems-to-hang-if-it-unable-to-connect-opensearch-why)
11
+ + [How to specify index codec](#how-to-specify-index-codec)
12
+ + [Cannot push logs to OpenSearch with connect_write timeout reached, why?](#cannot-push-logs-to-opensearch-with-connect_write-timeout-reached-why)
13
+ + [Index State Management feature is not provided, why?](#index-state-management-feature-is-not-provided-why)
14
+
15
+
16
+ ## Troubleshooting
17
+
18
+ ### Cannot send events to OpenSearch
19
+
20
+ A common cause of failure is that you are trying to connect to an OpenSearch instance with an incompatible version.
21
+
22
+ You can check the actual version of the client library installed on your system by executing the following command.
23
+
24
+ ```
25
+ # For td-agent users
26
+ $ /usr/sbin/td-agent-gem list opensearch
27
+ # For standalone Fluentd users
28
+ $ fluent-gem list opensearch
29
+ ```
30
+ Alternatively, with fluent-plugin-opensearch v0.1.0 or later, you can detect a version incompatibility with the `validate_client_version` option:
31
+
32
+ ```
33
+ validate_client_version true
34
+ ```
35
+
36
+ If you get the following error message, consider installing compatible OpenSearch client gems:
37
+
38
+ ```
39
+ Detected OpenSearch 1 but you use OpenSearch client 2.0.0.
40
+ Please consider to use 1.x series OpenSearch client.
41
+ ```
42
+
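The message above reports a major-version mismatch between the server and the client gem. As a rough illustration only, that comparison can be sketched in Ruby. `client_compatible?` is a hypothetical helper, and the rule that the major versions must match is an assumption inferred from the message text, not taken from the plugin source.

```ruby
require 'rubygems'

# Hypothetical helper: assumes "compatible" means "same major version",
# as the error message suggests. This is not the plugin's own check.
def client_compatible?(server_version, client_version)
  server_major = Gem::Version.new(server_version).segments.first
  client_major = Gem::Version.new(client_version).segments.first
  server_major == client_major
end

# The combination from the error message: OpenSearch 1 with client 2.0.0.
puts client_compatible?('1', '2.0.0')
# A 1.x client against a 1.x server.
puts client_compatible?('1.2.4', '1.0.0')
```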
43
+ ### Cannot see detailed failure log
44
+
45
+ A common cause of failure is that you are trying to connect to an OpenSearch instance with an incompatible SSL/TLS protocol version.
46
+
47
+ For example, `out_opensearch` sets `ssl_version` to TLSv1 for historical reasons.
48
+ Modern OpenSearch deployments require TLS v1.2 or later,
49
+ but `out_opensearch` hides transport-layer failure logs by default, so the handshake failure is not visible.
50
+ To see the transporter log, set the following configuration:
51
+
52
+ ```
53
+ with_transporter_log true
54
+ @log_level debug
55
+ ```
56
+
57
+ The Fluentd log then shows an entry like the following:
58
+
59
+ ```
60
+ 2018-10-24 10:00:00 +0900 [error]: #0 [Faraday::ConnectionFailed] SSL_connect returned=1 errno=0 state=SSLv2/v3 read server hello A: unknown protocol (OpenSSL::SSL::SSLError) {:host=>"opensearch-host", :port=>80, :scheme=>"https", :user=>"elastic", :password=>"changeme", :protocol=>"https"}
61
+ ```
62
+
63
+ This indicates that an unsupported TLS protocol version is being used.
64
+ To use TLS v1.2, set the `ssl_version` parameter as follows:
65
+
66
+ ```
67
+ ssl_version TLSv1_2
68
+ ```
69
+
70
+ Or, with v4.0.2 or later combined with Ruby 2.5 or later, the following configuration is also valid:
71
+
72
+ ```
73
+ ssl_max_version TLSv1_2
74
+ ssl_min_version TLSv1_2
75
+ ```
76
+
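To confirm from the Fluentd host that an endpoint accepts TLS v1.2 at all, a handshake can be attempted with Ruby's standard `openssl` library. This is a minimal diagnostic sketch, independent of the plugin. `tls12_handshake_ok?` is a hypothetical helper name, and the host and port in the usage line are placeholders.

```ruby
require 'openssl'
require 'socket'

# Hypothetical helper: returns true only if a TLS v1.2 handshake succeeds.
# Certificate verification is left at the SSLContext default (none),
# because only the protocol version is being probed here.
def tls12_handshake_ok?(host, port)
  ctx = OpenSSL::SSL::SSLContext.new
  ctx.min_version = OpenSSL::SSL::TLS1_2_VERSION
  ctx.max_version = OpenSSL::SSL::TLS1_2_VERSION
  tcp = TCPSocket.new(host, port)
  ssl = OpenSSL::SSL::SSLSocket.new(tcp, ctx)
  ssl.hostname = host # send SNI
  ssl.connect
  ssl.ssl_version == 'TLSv1.2'
rescue OpenSSL::SSL::SSLError, SystemCallError, SocketError
  false
ensure
  tcp&.close
end

# Placeholder endpoint; replace with your own host and port.
puts tls12_handshake_ok?('localhost', 9200)
```

A `false` result means either that the endpoint is unreachable or that it rejected the TLS v1.2 handshake, so check connectivity separately before drawing conclusions about the protocol version.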
77
+ ### Cannot connect TLS enabled reverse Proxy
78
+
79
+ A common cause of failure is that you are trying to connect to an OpenSearch instance behind an nginx reverse proxy that uses an incompatible SSL/TLS protocol version.
80
+
81
+ For example, `out_opensearch` sets `ssl_version` to TLSv1 for historical reasons.
82
+ Nowadays, nginx reverse proxies use TLS v1.2 or later for security reasons,
83
+ but `out_opensearch` hides transport-layer failure logs by default.
84
+
85
+ If you set up nginx reverse proxy with TLS v1.2:
86
+
87
+ ```
88
+ server {
89
+ listen <your IP address>:9400;
90
+ server_name <ES-Host>;
91
+ ssl on;
92
+ ssl_certificate /etc/ssl/certs/server-bundle.pem;
93
+ ssl_certificate_key /etc/ssl/private/server-key.pem;
94
+ ssl_client_certificate /etc/ssl/certs/ca.pem;
95
+ ssl_verify_client on;
96
+ ssl_verify_depth 2;
97
+
98
+ # Reference : https://cipherli.st/
99
+ ssl_protocols TLSv1.2;
100
+ ssl_prefer_server_ciphers on;
101
+ ssl_ciphers "EECDH+AESGCM:EDH+AESGCM:AES256+EECDH:AES256+EDH";
102
+ ssl_ecdh_curve secp384r1; # Requires nginx >= 1.1.0
103
+ ssl_session_cache shared:SSL:10m;
104
+ ssl_session_tickets off; # Requires nginx >= 1.5.9
105
+ ssl_stapling on; # Requires nginx >= 1.3.7
106
+ ssl_stapling_verify on; # Requires nginx >= 1.3.7
107
+ resolver 127.0.0.1 valid=300s;
108
+ resolver_timeout 5s;
109
+ add_header Strict-Transport-Security "max-age=63072000; includeSubDomains; preload";
110
+ add_header X-Frame-Options DENY;
111
+ add_header X-Content-Type-Options nosniff;
112
+
113
+ client_max_body_size 64M;
114
+ keepalive_timeout 5;
115
+
116
+ location / {
117
+ proxy_set_header Host $host;
118
+ proxy_set_header X-Real-IP $remote_addr;
119
+ proxy_pass http://localhost:9200;
120
+ }
121
+ }
122
+ ```
123
+
124
+ With this configuration, the nginx reverse proxy starts with TLS v1.2 only.
125
+
126
+ Fluentd then dies suddenly with the following log:
127
+ ```
128
+ Oct 31 9:44:45 <ES-Host> fluentd[6442]: log writing failed. execution expired
129
+ Oct 31 9:44:45 <ES-Host> fluentd[6442]: /opt/fluentd/embedded/lib/ruby/gems/2.4.0/gems/excon-0.62.0/lib/excon/ssl_socket.rb:10:in `initialize': stack level too deep (SystemStackError)
130
+ Oct 31 9:44:45 <ES-Host> fluentd[6442]: from /opt/fluentd/embedded/lib/ruby/gems/2.4.0/gems/excon-0.62.0/lib/excon/connection.rb:429:in `new'
131
+ Oct 31 9:44:45 <ES-Host> fluentd[6442]: from /opt/fluentd/embedded/lib/ruby/gems/2.4.0/gems/excon-0.62.0/lib/excon/connection.rb:429:in `socket'
132
+ Oct 31 9:44:45 <ES-Host> fluentd[6442]: from /opt/fluentd/embedded/lib/ruby/gems/2.4.0/gems/excon-0.62.0/lib/excon/connection.rb:111:in `request_call'
133
+ Oct 31 9:44:45 <ES-Host> fluentd[6442]: from /opt/fluentd/embedded/lib/ruby/gems/2.4.0/gems/excon-0.62.0/lib/excon/middlewares/mock.rb:48:in `request_call'
134
+ Oct 31 9:44:45 <ES-Host> fluentd[6442]: from /opt/fluentd/embedded/lib/ruby/gems/2.4.0/gems/excon-0.62.0/lib/excon/middlewares/instrumentor.rb:26:in `request_call'
135
+ Oct 31 9:44:45 <ES-Host> fluentd[6442]: from /opt/fluentd/embedded/lib/ruby/gems/2.4.0/gems/excon-0.62.0/lib/excon/middlewares/base.rb:16:in `request_call'
136
+ Oct 31 9:44:45 <ES-Host> fluentd[6442]: from /opt/fluentd/embedded/lib/ruby/gems/2.4.0/gems/excon-0.62.0/lib/excon/middlewares/base.rb:16:in `request_call'
137
+ Oct 31 9:44:45 <ES-Host> fluentd[6442]: from /opt/fluentd/embedded/lib/ruby/gems/2.4.0/gems/excon-0.62.0/lib/excon/middlewares/base.rb:16:in `request_call'
138
+ Oct 31 9:44:45 <ES-Host> fluentd[6442]: ... 9266 levels...
139
+ Oct 31 9:44:45 <ES-Host> fluentd[6442]: from /opt/td-agent/embedded/lib/ruby/site_ruby/2.4.0/rubygems/core_ext/kernel_require.rb:55:in `require'
140
+ Oct 31 9:44:45 <ES-Host> fluentd[6442]: from /opt/fluentd/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.2.5/bin/fluentd:8:in `<top (required)>'
141
+ Oct 31 9:44:45 <ES-Host> fluentd[6442]: from /opt/fluentd/embedded/bin/fluentd:22:in `load'
142
+ Oct 31 9:44:45 <ES-Host> fluentd[6442]: from /opt/fluentd/embedded/bin/fluentd:22:in `<main>'
143
+ Oct 31 9:44:45 <ES-Host> systemd[1]: fluentd.service: Control process exited, code=exited status=1
144
+ ```
145
+
146
+ To see the transporter log, set the following configuration:
147
+
148
+ ```
149
+ with_transporter_log true
150
+ @log_level debug
151
+ ```
152
+
153
+ The Fluentd log then shows entries like the following:
154
+
155
+ ```
156
+ 2018-10-31 10:00:57 +0900 [warn]: #7 [Faraday::ConnectionFailed] Attempt 2 connecting to {:host=>"<ES-Host>", :port=>9400, :scheme=>"https", :protocol=>"https"}
157
+ 2018-10-31 10:00:57 +0900 [error]: #7 [Faraday::ConnectionFailed] Connection reset by peer - SSL_connect (Errno::ECONNRESET) {:host=>"<ES-Host>", :port=>9400, :scheme=>"https", :protocol=>"https"}
158
+ ```
159
+
160
+ The logs above indicate that the root cause is an SSL/TLS version mismatch between fluent-plugin-opensearch and the nginx reverse proxy.
161
+
162
+ To use TLS v1.2, set the `ssl_version` parameter as follows:
163
+
164
+ ```
165
+ ssl_version TLSv1_2
166
+ ```
167
+
168
+ Or, with v4.0.2 or later combined with Ruby 2.5 or later, the following configuration is also valid:
169
+
170
+ ```
171
+ ssl_max_version TLSv1_2
172
+ ssl_min_version TLSv1_2
173
+ ```
174
+
175
+ ### Declined logs are resubmitted forever, why?
176
+
177
+ Sometimes users write Fluentd configuration like this:
178
+
179
+ ```aconf
180
+ <match **>
181
+ @type opensearch
182
+ host localhost
183
+ port 9200
184
+ type_name fluentd
185
+ logstash_format true
186
+ time_key @timestamp
187
+ include_timestamp true
188
+ reconnect_on_error true
189
+ reload_on_failure true
190
+ reload_connections false
191
+ request_timeout 120s
192
+ </match>
193
+ ```
194
+
195
+ The above configuration does not use the [`@label` feature](https://docs.fluentd.org/v1.0/articles/config-file#(5)-group-filter-and-output:-the-%E2%80%9Clabel%E2%80%9D-directive) and uses the catch-all glob pattern (`**`).
196
+ This is usually a problematic configuration.
197
+
198
+ In an error scenario, error events are emitted with the `@ERROR` label, and Fluentd's own warning logs are emitted with a `fluent.*` tag.
199
+ The catch-all glob pattern matches those events as well, so the problematic event is resubmitted into the OpenSearch output pipeline.
200
+
201
+ This causes a flood of declined-log messages:
202
+
203
+ ```log
204
+ 2018-11-13 11:16:27 +0000 [warn]: #0 dump an error event: error_class=Fluent::Plugin::OpenSearchErrorHandler::OpenSearchError error="400 - Rejected by OpenSearch" location=nil tag="app.fluentcat" time=2018-11-13 11:16:17.492985640 +0000 record={"message"=>"\xFF\xAD"}
205
+ 2018-11-13 11:16:38 +0000 [warn]: #0 dump an error event: error_class=Fluent::Plugin::OpenSearchErrorHandler::OpenSearchError error="400 - Rejected by OpenSearch" location=nil tag="fluent.warn" time=2018-11-13 11:16:27.978851140 +0000 record={"error"=>"#<Fluent::Plugin::OpenSearchErrorHandler::OpenSearchError: 400 - Rejected by OpenSearch>", "location"=>nil, "tag"=>"app.fluentcat", "time"=>2018-11-13 11:16:17.492985640 +0000, "record"=>{"message"=>"\xFF\xAD"}, "message"=>"dump an error event: error_class=Fluent::Plugin::OpenSearchErrorHandler::OpenSearchError error=\"400 - Rejected by OpenSearch\" location=nil tag=\"app.fluentcat\" time=2018-11-13 11:16:17.492985640 +0000 record={\"message\"=>\"\\xFF\\xAD\"}"}
206
+ ```
207
+
208
+ To avoid this, use a more concrete tag route or use `@label`.
209
+ The following sections show two ways to stop the flood of declined logs:
210
+ one uses concrete tag routing, the other uses label routing.
211
+
212
+ #### Using concrete tag routing
213
+
214
+ The following configuration uses concrete tag route:
215
+
216
+ ```aconf
217
+ <match out.opensearch.**>
218
+ @type opensearch
219
+ host localhost
220
+ port 9200
221
+ type_name fluentd
222
+ logstash_format true
223
+ time_key @timestamp
224
+ include_timestamp true
225
+ reconnect_on_error true
226
+ reload_on_failure true
227
+ reload_connections false
228
+ request_timeout 120s
229
+ </match>
230
+ ```
231
+
232
+ #### Using label feature
233
+
234
+ The following configuration uses label:
235
+
236
+ ```aconf
237
+ <source>
238
+ @type forward
239
+ @label @ES
240
+ </source>
241
+ <label @ES>
242
+ <match out.opensearch.**>
243
+ @type opensearch
244
+ host localhost
245
+ port 9200
246
+ type_name fluentd
247
+ logstash_format true
248
+ time_key @timestamp
249
+ include_timestamp true
250
+ reconnect_on_error true
251
+ reload_on_failure true
252
+ reload_connections false
253
+ request_timeout 120s
254
+ </match>
255
+ </label>
256
+ <label @ERROR>
257
+ <match **>
258
+ @type stdout
259
+ </match>
260
+ </label>
261
+ ```
262
+
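The difference between `<match **>` and `<match out.opensearch.**>` can be sketched with a few lines of Ruby. This is a simplified stand-in for Fluentd's tag matching, not its real implementation. `tag_matches?` is a hypothetical helper that handles only the pattern shapes used in this section, and the tag `out.opensearch.app` is made up for illustration.

```ruby
# Simplified stand-in for Fluentd's tag matching. It handles only "**",
# "prefix.**" and exact tags, which is enough for the examples above.
def tag_matches?(pattern, tag)
  if pattern == '**'
    true # "**" matches every tag, including Fluentd's own fluent.* logs
  elsif pattern.end_with?('.**')
    prefix = pattern.delete_suffix('.**')
    tag == prefix || tag.start_with?("#{prefix}.")
  else
    pattern == tag
  end
end

%w[app.fluentcat fluent.warn out.opensearch.app].each do |tag|
  catch_all = tag_matches?('**', tag)
  concrete  = tag_matches?('out.opensearch.**', tag)
  puts "#{tag}: ** => #{catch_all}, out.opensearch.** => #{concrete}"
end
```

With the concrete pattern, `fluent.warn` no longer matches, so Fluentd's own warning about a rejected event is not fed back into the OpenSearch output.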
263
+ ### Suggested to install typhoeus gem, why?
264
+
265
+ fluent-plugin-opensearch doesn't depend on the typhoeus gem by default.
266
+ If you want to use the typhoeus backend, you must install the typhoeus gem yourself.
267
+
268
+ If you use vanilla Fluentd, you can install it by:
269
+
270
+ ```
271
+ gem install typhoeus
272
+ ```
273
+
274
+ If you use td-agent instead of vanilla Fluentd, you have to use `td-agent-gem`:
275
+
276
+ ```
277
+ td-agent-gem install typhoeus
278
+ ```
279
+
280
+ For more detail, refer to [the official plugin management document](https://docs.fluentd.org/v1.0/articles/plugin-management).
281
+
282
+ ### Random 400 - Rejected by OpenSearch is occured, why?
283
+
284
+ Index templates installed in OpenSearch sometimes cause `400 - Rejected by OpenSearch` errors.
285
+ For example, a Kubernetes audit log has the following structure:
286
+
287
+ ```json
288
+ "responseObject":{
289
+ "kind":"SubjectAccessReview",
290
+ "apiVersion":"authorization.k8s.io/v1beta1",
291
+ "metadata":{
292
+ "creationTimestamp":null
293
+ },
294
+ "spec":{
295
+ "nonResourceAttributes":{
296
+ "path":"/",
297
+ "verb":"get"
298
+ },
299
+ "user":"system:anonymous",
300
+ "group":[
301
+ "system:unauthenticated"
302
+ ]
303
+ },
304
+ "status":{
305
+ "allowed":true,
306
+ "reason":"RBAC: allowed by ClusterRoleBinding \"cluster-system-anonymous\" of ClusterRole \"cluster-admin\" to User \"system:anonymous\""
307
+ }
308
+ },
309
+ ```
310
+
311
+ The last element, `status`, is sometimes a plain string instead of an object, as in `"status":"Success"`.
312
+ This type mismatch for the same field causes the status 400 error.
313
+
314
+ There are some solutions for fixing this:
315
+
316
+ #### Solution 1
317
+
318
+ Use this when a specific key causes the element type mismatch.
319
+
320
+ Use dynamic mapping with the following template:
321
+
322
+ ```json
323
+ {
324
+ "template": "YOURINDEXNAME-*",
325
+ "mappings": {
326
+ "fluentd": {
327
+ "dynamic_templates": [
328
+ {
329
+ "default_no_index": {
330
+ "path_match": "^.*$",
331
+ "path_unmatch": "^(@timestamp|auditID|level|stage|requestURI|sourceIPs|metadata|objectRef|user|verb)(\\..+)?$",
332
+ "match_pattern": "regex",
333
+ "mapping": {
334
+ "index": false,
335
+ "enabled": false
336
+ }
337
+ }
338
+ }
339
+ ]
340
+ }
341
+ }
342
+ }
343
+ ```
344
+
345
+ Note that `YOURINDEXNAME` should be replaced with the index prefix you use.
346
+
347
+ #### Solution 2
348
+
349
+ Use this when the `responseObject` and `requestObject` keys are not always present.
350
+
351
+ ```aconf
352
+ <filter YOURROUTETAG>
353
+ @id kube_api_audit_normalize
354
+ @type record_transformer
355
+ auto_typecast false
356
+ enable_ruby true
357
+ <record>
358
+ host "#{ENV['K8S_NODE_NAME']}"
359
+ responseObject ${record["responseObject"].nil? ? "none": record["responseObject"].to_json}
360
+ requestObject ${record["requestObject"].nil? ? "none": record["requestObject"].to_json}
361
+ origin kubernetes-api-audit
362
+ </record>
363
+ </filter>
364
+ ```
365
+
366
+ You need to normalize the `responseObject` and `requestObject` keys with record_transformer or a similar plugin.
367
+
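The effect of the two `${...}` expressions in the filter above can be checked in plain Ruby. The sketch below wraps the same expression in a hypothetical helper, `normalize_object`, and applies it to made-up sample records. Whatever shape the input has, the result is always a string, so the field type in the index stays stable.

```ruby
require 'json'

# Same expression as in the <record> block above, wrapped in a
# hypothetical helper so it can be tried outside Fluentd.
def normalize_object(value)
  value.nil? ? 'none' : value.to_json
end

# Made-up sample records covering the three cases discussed above.
records = [
  { 'responseObject' => { 'status' => { 'allowed' => true } } }, # status is an object
  { 'responseObject' => { 'status' => 'Success' } },             # status is a string
  {}                                                             # key is missing
]

records.each do |record|
  puts normalize_object(record['responseObject']).inspect
end
```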
368
+ ### Fluentd seems to hang if it unable to connect OpenSearch, why?
369
+
370
+ During the `#configure` phase, the OpenSearch plugin waits until it can communicate with the OpenSearch instance.
371
+ By default, this blocks Fluentd from launching,
372
+ because Fluentd requires the configuration to be set up correctly during the `#configure` phase.
373
+
374
+ After the `#configure` phase, the plugin runs very fast and, under heavy use, sends events at a high rate.
375
+
376
+ In this scenario, the configuration must be set up correctly by the end of the `#configure` phase.
377
+ For that reason, the default parameters are conservative, and may be too conservative for advanced users.
378
+
379
+ To relax this pessimistic behavior, you can use the following configuration:
380
+
381
+ ```aconf
382
+ <match **>
383
+ @type opensearch
384
+ # Advanced users who know their OpenSearch version
385
+ # can disable the OpenSearch version check at startup.
386
+ verify_os_version_at_startup false
387
+ # If you know that your OpenSearch major version is 1, you can set 1 here.
388
+ default_opensearch_version 1
389
+ # With a very stable OpenSearch cluster, you can reduce the number of retries. (minimum is 1)
390
+ max_retry_get_os_version 1
391
+ # With a very stable OpenSearch cluster, you can reduce the number of retries. (minimum is 1)
392
+ max_retry_putting_template 1
393
+ # ... and some OpenSearch plugin configuration
394
+ </match>
395
+ ```
396
+
397
+ ### How to specify index codec
398
+
399
+ OpenSearch supports compression codecs for stored data, such as LZ4 and best_compression.
400
+ fluent-plugin-opensearch doesn't provide an API for specifying the compression method.
401
+
402
+ Instead, you can specify the compression method for stored data with an index template:
403
+
404
+ Create `compression.json` as follows:
405
+
406
+ ```json
407
+ {
408
+ "order": 100,
409
+ "index_patterns": [
410
+ "YOUR-INDEX-PATTERN"
411
+ ],
412
+ "settings": {
413
+ "index": {
414
+ "codec": "best_compression"
415
+ }
416
+ }
417
+ }
418
+ ```
419
+
420
+ Then, specify the above template in your configuration:
421
+
422
+ ```aconf
423
+ template_name best_compression_tmpl
424
+ template_file compression.json
425
+ ```
426
+
427
+ OpenSearch will store data with `best_compression`:
428
+
429
+ ```
430
+ % curl -XGET 'http://localhost:9200/logstash-2019.12.06/_settings?pretty'
431
+ ```
432
+
433
+ ```json
434
+ {
435
+ "logstash-2019.12.06" : {
436
+ "settings" : {
437
+ "index" : {
438
+ "codec" : "best_compression",
439
+ "number_of_shards" : "1",
440
+ "provided_name" : "logstash-2019.12.06",
441
+ "creation_date" : "1575622843800",
442
+ "number_of_replicas" : "1",
443
+ "uuid" : "THE_AWESOMEUUID",
444
+ "version" : {
445
+ "created" : "7040100"
446
+ }
447
+ }
448
+ }
449
+ }
450
+ }
451
+ ```
452
+
453
+ ### Cannot push logs to OpenSearch with connect_write timeout reached, why?
454
+
455
+ This usually means that the OpenSearch cluster is exhausted.
456
+
457
+ Typically, Fluentd emits warnings like the following:
458
+
459
+ ```log
460
+ 2019-12-29 00:23:33 +0000 [warn]: buffer flush took longer time than slow_flush_log_threshold: elapsed_time=27.283766102716327 slow_flush_log_threshold=15.0 plugin_id="object:aaaffaaaaaff"
461
+ 2019-12-29 00:23:33 +0000 [warn]: buffer flush took longer time than slow_flush_log_threshold: elapsed_time=26.161768959928304 slow_flush_log_threshold=15.0 plugin_id="object:aaaffaaaaaff"
462
+ 2019-12-29 00:23:33 +0000 [warn]: buffer flush took longer time than slow_flush_log_threshold: elapsed_time=28.713624476008117 slow_flush_log_threshold=15.0 plugin_id="object:aaaffaaaaaff"
463
+ 2019-12-29 01:39:18 +0000 [warn]: Could not push logs to OpenSearch, resetting connection and trying again. connect_write timeout reached
464
+ 2019-12-29 01:39:18 +0000 [warn]: Could not push logs to OpenSearch, resetting connection and trying again. connect_write timeout reached
465
+ ```
466
+
467
+ These warnings are usually caused by an exhausted OpenSearch cluster that is short on resources.
468
+
469
+ If CPU usage spikes and the OpenSearch cluster is consuming all available CPU, the issue is caused by a CPU resource shortage.
470
+
471
+ Check your OpenSearch cluster health status and resource usage.
472
+
473
+ ### Index State Management feature is not provided, why?
474
+
475
+ According to the OpenSearch documentation, the Index Lifecycle Management (ILM) feature has been renamed to Index State Management (ISM), and using it from logging agents is not recommended.
476
+
477
+ In addition, the Ruby client library had a license issue with the original ILM part. To avoid this license issue, the OpenSearch Ruby client library team decided to remove this part from their Ruby client code:
478
+ https://github.com/opensearch-project/opensearch-ruby/pull/4
479
+
480
+ The recommended way to use Index State Management (ISM) is via OpenSearch Dashboards (the OpenSearch fork of Kibana).
481
+
482
+ See also: https://opensearch.org/docs/latest/im-plugin/ism/index/