fluent-plugin-elasticsearch 5.0.3 → 5.1.1
- checksums.yaml +4 -4
- data/.github/workflows/linux.yml +1 -1
- data/.github/workflows/macos.yml +1 -1
- data/.github/workflows/windows.yml +1 -1
- data/History.md +19 -0
- data/README.md +84 -2
- data/fluent-plugin-elasticsearch.gemspec +1 -1
- data/lib/fluent/plugin/elasticsearch_error_handler.rb +13 -2
- data/lib/fluent/plugin/elasticsearch_index_template.rb +13 -1
- data/lib/fluent/plugin/out_elasticsearch.rb +52 -4
- data/lib/fluent/plugin/out_elasticsearch_data_stream.rb +81 -49
- data/test/plugin/test_elasticsearch_error_handler.rb +25 -8
- data/test/plugin/test_elasticsearch_fallback_selector.rb +1 -1
- data/test/plugin/test_elasticsearch_index_lifecycle_management.rb +10 -0
- data/test/plugin/test_in_elasticsearch.rb +12 -0
- data/test/plugin/test_out_elasticsearch.rb +412 -18
- data/test/plugin/test_out_elasticsearch_data_stream.rb +348 -98
- data/test/plugin/test_out_elasticsearch_dynamic.rb +100 -5
- metadata +3 -3
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 27d74e7048671def02b98e337c052c395152021a4a3f4c2138d1780c725d09bd
+  data.tar.gz: eb5282b8e688b091700a549c711af2f1959bc9b1b2c4f6fd1b49e3119c62ddb7
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 3a3ad9fa5259fcd1e80a85bdf7d1acd11cd26675d4f47f326100db242f0f9320232530099eb5346e5ea11aba76c4cc66cfc2e97f6393ca95d1227281217283ea
+  data.tar.gz: f97182a9487be71d34ddcd8ec2dd046eca9c6de1cae55d3842feeeaef627f23a05862be4783aee37fe38c84cba3bf984a51fb7fb3e8bad50f1bf0f57c956803e
data/.github/workflows/linux.yml
CHANGED
data/.github/workflows/macos.yml
CHANGED
data/History.md
CHANGED
@@ -1,6 +1,25 @@
 ## Changelog [[tags]](https://github.com/uken/fluent-plugin-elasticsearch/tags)
 
 ### [Unreleased]
+
+### 5.1.1
+- Report appropriate error for data_stream parameters (#922)
+- Add ILM and template parameters for data streams (#920)
+- Support Buffer in Data Stream Output (#917)
+
+### 5.1.0
+- Correct default target bytes value (#914)
+- Handle elasticsearch-ruby 7.14 properly (#913)
+
+### 5.0.5
+- Drop json_parse_exception messages for bulk failures (#900)
+- GitHub Actions: Drop Ruby 2.5 due to EOL (#894)
+
+### 5.0.4
+- test: out_elasticsearch: Remove a needless headers from affinity stub (#888)
+- Target Index Affinity (#883)
+
+### 5.0.3
 - Fix use_legacy_template documentation (#880)
 - Add FAQ for dynamic index/template (#878)
 - Handle IPv6 address string on host and hosts parameters (#877)
data/README.md
CHANGED
@@ -11,7 +11,7 @@ Send your logs to Elasticsearch (and search them with Kibana maybe?)
 
 Note: For Amazon Elasticsearch Service please consider using [fluent-plugin-aws-elasticsearch-service](https://github.com/atomita/fluent-plugin-aws-elasticsearch-service)
 
-Current maintainers: @cosmo0920
+Current maintainers: [Hiroshi Hatake | @cosmo0920](https://github.com/cosmo0920), [Kentaro Hayashi | @kenhys](https://github.com/kenhys)
 
 * [Installation](#installation)
 * [Usage](#usage)
@@ -38,6 +38,7 @@ Current maintainers: @cosmo0920
   + [suppress_type_name](#suppress_type_name)
   + [target_index_key](#target_index_key)
   + [target_type_key](#target_type_key)
+  + [target_index_affinity](#target_index_affinity)
   + [template_name](#template_name)
   + [template_file](#template_file)
   + [template_overwrite](#template_overwrite)
@@ -454,6 +455,75 @@ and this record will be written to the specified index (`logstash-2014.12.19`) r
 
 Similar to `target_index_key` config, find the type name to write to in the record under this key (or nested record). If key not found in record - fallback to `type_name` (default "fluentd").
 
+### target_index_affinity
+
+Enable the plugin to dynamically select a logstash time-based target index in update/upsert operations, based on already indexed data rather than the current time of indexing.
+
+```
+target_index_affinity true # defaults to false
+```
+
+By default the plugin writes data to a logstash-format index based on the current time. For example, with a daily index, data arriving after midnight is written to the newly created index. This is normally fine when data comes from a single source and is not updated after indexing.
+
+But consider a use case where data is also updated after indexing, `id_key` is used to identify the document uniquely for updating, and the logstash format is wanted for easy data management and retention. Updates are done right after indexing to complete the data (not all data is available from a single source), and no further updates happen later. In this case a problem occurs at index rotation time, when writes with the same id_key value may end up in two different indices.
+
+This setting searches existing data with Elasticsearch's [id query](https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-ids-query.html) using the `id_key` value (against the logstash_prefix and logstash_prefix_separator index pattern, e.g. `logstash-*`). The index of the found data is used for the update/upsert. When no data is found, data is written to the current logstash index as usual.
+
+This setting requires the following other settings:
+```
+logstash_format true
+id_key myId # Some field on your data to identify the data uniquely
+write_operation upsert # upsert or update
+```
+
+Suppose you have the following situation, where two different match sections consume data from two different Kafka topics independently but close in time to each other (order not known).
+
+```
+<match data1>
+  @type elasticsearch
+  ...
+  id_key myId
+  write_operation upsert
+  logstash_format true
+  logstash_dateformat %Y.%m.%d
+  logstash_prefix myindexprefix
+  target_index_affinity true
+  ...
+
+<match data2>
+  @type elasticsearch
+  ...
+  id_key myId
+  write_operation upsert
+  logstash_format true
+  logstash_dateformat %Y.%m.%d
+  logstash_prefix myindexprefix
+  target_index_affinity true
+  ...
+```
+
+If your first (data1) input is:
+```
+{
+  "myId": "myuniqueId1",
+  "datafield1": "some value",
+}
+```
+
+and your second (data2) input is:
+```
+{
+  "myId": "myuniqueId1",
+  "datafield99": "some important data from other source tightly related to id myuniqueId1 and wanted to be in same document.",
+}
+```
+
+Today's date is 10.05.2021, so data is written to index `myindexprefix-2021.05.10` when both data1 and data2 are consumed during the day.
+But when we are close to index rotation and data1 is consumed and indexed at `2021-05-10T23:59:55.59707672Z` while data2
+is consumed a bit later at `2021-05-11T00:00:58.222079Z`, i.e. the logstash index has been rotated, data2 would normally have been written
+to index `myindexprefix-2021.05.11`. But with `target_index_affinity` set to true, data2 is now written to index `myindexprefix-2021.05.10`,
+into the same document as data1, as wanted, and a duplicated document is avoided.
+
 ### template_name
 
 The name of the template to define. If a template by the name given is already present, it will be left unchanged, unless [template_overwrite](#template_overwrite) is set, in which case the template will be updated.
@@ -1451,7 +1521,7 @@ You can enable this feature by specifying `@type elasticsearch_data_stream`.
   data_stream_name test
 ```
 
-When `@type elasticsearch_data_stream` is used, ILM default policy is set to the specified data stream.
+When `@type elasticsearch_data_stream` is used, the default ILM policy is applied to the specified data stream unless `data_stream_ilm_name` and `data_stream_template_name` are specified.
 Then, the matching index template is also created automatically.
 
 ### data_stream_name
@@ -1459,6 +1529,18 @@ Then, the matching index template is also created automatically.
 You can specify Elasticsearch data stream name by this parameter.
 This parameter is mandatory for `elasticsearch_data_stream`.
 
+### data_stream_template_name
+
+You can specify an existing matching index template for the data stream. If it is not present, a new matching index template is created.
+
+Default value is `data_stream_name`.
+
+### data_stream_ilm_name
+
+You can specify the name of an existing ILM policy, which will be applied to the data stream. If it is not present, a new default ILM policy is created (unless `data_stream_template_name` is defined, in which case the ILM policy is taken from the specified matching index template).
+
+Default value is `data_stream_name`.
+
 There are some limitations about naming rule.
 
 In more detail, please refer to the [Path parameters](https://www.elastic.co/guide/en/elasticsearch/reference/master/indices-create-data-stream.html#indices-create-data-stream-api-path-params).
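To make the new parameters concrete, here is a minimal sketch of an `elasticsearch_data_stream` output that points both parameters at existing resources and adds a buffer section (buffering for this output was added in 5.1.1, #917). The stream, template, and policy names (`logs-myservice`, `logs-myservice-template`, `logs-myservice-policy`) and the host are illustrative assumptions, not plugin defaults:

```
<match myservice.**>
  @type elasticsearch_data_stream
  host elasticsearch.local
  port 9200
  data_stream_name logs-myservice
  # reuse an existing index template instead of auto-creating one
  data_stream_template_name logs-myservice-template
  # apply an existing ILM policy instead of the default one
  data_stream_ilm_name logs-myservice-policy
  <buffer>
    flush_interval 5s
  </buffer>
</match>
```

If neither parameter is given, both default to the value of `data_stream_name`, and the plugin falls back to creating a matching index template and a default ILM policy as described above.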
data/fluent-plugin-elasticsearch.gemspec
CHANGED
@@ -3,7 +3,7 @@ $:.push File.expand_path('../lib', __FILE__)
 
 Gem::Specification.new do |s|
   s.name = 'fluent-plugin-elasticsearch'
-  s.version = '5.0.3'
+  s.version = '5.1.1'
   s.authors = ['diogo', 'pitr', 'Hiroshi Hatake']
   s.email = ['pitr.vern@gmail.com', 'me@diogoterror.com', 'cosmo0920.wp@gmail.com']
   s.description = %q{Elasticsearch output plugin for Fluent event collector}
data/lib/fluent/plugin/elasticsearch_error_handler.rb
CHANGED
@@ -23,6 +23,10 @@ class Fluent::Plugin::ElasticsearchErrorHandler
     unrecoverable_error_types.include?(type)
   end
 
+  def unrecoverable_record_error?(type)
+    ['json_parse_exception'].include?(type)
+  end
+
   def log_es_400_reason(&block)
     if @plugin.log_es_400_reason
       block.call
@@ -43,15 +47,17 @@ class Fluent::Plugin::ElasticsearchErrorHandler
     stats = Hash.new(0)
     meta = {}
     header = {}
+    affinity_target_indices = @plugin.get_affinity_target_indices(chunk)
     chunk.msgpack_each do |time, rawrecord|
       bulk_message = ''
       next unless rawrecord.is_a? Hash
       begin
         # we need a deep copy for process_message to alter
         processrecord = Marshal.load(Marshal.dump(rawrecord))
-        meta, header, record = @plugin.process_message(tag, meta, header, time, processrecord, extracted_values)
+        meta, header, record = @plugin.process_message(tag, meta, header, time, processrecord, affinity_target_indices, extracted_values)
         next unless @plugin.append_record_to_messages(@plugin.write_operation, meta, header, record, bulk_message)
       rescue => e
+        @plugin.log.debug("Exception in error handler during deep copy: #{e}")
         stats[:bad_chunk_record] += 1
         next
       end
@@ -105,10 +111,15 @@ class Fluent::Plugin::ElasticsearchErrorHandler
         elsif item[write_operation].has_key?('error') && item[write_operation]['error'].has_key?('type')
           type = item[write_operation]['error']['type']
           stats[type] += 1
-          retry_stream.add(time, rawrecord)
           if unrecoverable_error?(type)
             raise ElasticsearchRequestAbortError, "Rejected Elasticsearch due to #{type}"
           end
+          if unrecoverable_record_error?(type)
+            @plugin.router.emit_error_event(tag, time, rawrecord, ElasticsearchError.new("#{status} - #{type}: #{reason}"))
+            next
+          else
+            retry_stream.add(time, rawrecord) unless unrecoverable_record_error?(type)
+          end
         else
           # When we don't have a type field, something changed in the API
           # expected return values (ES 2.x)
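Since records that fail with `json_parse_exception` are now emitted as Fluentd error events instead of being retried (#900), they can be captured with Fluentd's standard `@ERROR` label. A minimal sketch, assuming the original events carry an `app.**` tag and a local file is an acceptable destination for the rejected records:

```
<label @ERROR>
  <match app.**>
    @type file
    path /var/log/fluent/es-rejected-records
  </match>
</label>
```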
data/lib/fluent/plugin/elasticsearch_index_template.rb
CHANGED
@@ -32,13 +32,25 @@ module Fluent::ElasticsearchIndexTemplate
     return false
   end
 
+  def host_unreachable_exceptions
+    if Gem::Version.new(::Elasticsearch::Transport::VERSION) >= Gem::Version.new("7.14.0")
+      # elasticsearch-ruby 7.14.0's elasticsearch-transport does not extends
+      # Elasticsearch class on Transport.
+      # This is why #host_unreachable_exceptions is not callable directly
+      # via transport (not transport's transport instance accessor) any more.
+      client.transport.transport.host_unreachable_exceptions
+    else
+      client.transport.host_unreachable_exceptions
+    end
+  end
+
   def retry_operate(max_retries, fail_on_retry_exceed = true, catch_trasport_exceptions = true)
     return unless block_given?
     retries = 0
     transport_errors = Elasticsearch::Transport::Transport::Errors.constants.map{ |c| Elasticsearch::Transport::Transport::Errors.const_get c } if catch_trasport_exceptions
     begin
       yield
-    rescue *
+    rescue *host_unreachable_exceptions, *transport_errors, Timeout::Error => e
       @_es = nil
       @_es_info = nil
       if retries < max_retries
data/lib/fluent/plugin/out_elasticsearch.rb
CHANGED
@@ -2,6 +2,7 @@
 require 'date'
 require 'excon'
 require 'elasticsearch'
+require 'set'
 begin
   require 'elasticsearch/xpack'
 rescue LoadError
@@ -71,7 +72,7 @@ module Fluent::Plugin
     DEFAULT_TYPE_NAME_ES_7x = "_doc".freeze
     DEFAULT_TYPE_NAME = "fluentd".freeze
     DEFAULT_RELOAD_AFTER = -1
-
+    DEFAULT_TARGET_BULK_BYTES = -1
     DEFAULT_POLICY_ID = "logstash-policy"
 
     config_param :host, :string, :default => 'localhost'
@@ -165,7 +166,7 @@ EOC
     config_param :suppress_doc_wrap, :bool, :default => false
     config_param :ignore_exceptions, :array, :default => [], value_type: :string, :desc => "Ignorable exception list"
     config_param :exception_backup, :bool, :default => true, :desc => "Chunk backup flag when ignore exception occured"
-    config_param :bulk_message_request_threshold, :size, :default =>
+    config_param :bulk_message_request_threshold, :size, :default => DEFAULT_TARGET_BULK_BYTES
     config_param :compression_level, :enum, list: [:no_compression, :best_speed, :best_compression, :default_compression], :default => :no_compression
     config_param :enable_ilm, :bool, :default => false
     config_param :ilm_policy_id, :string, :default => DEFAULT_POLICY_ID
@@ -175,6 +176,7 @@ EOC
     config_param :truncate_caches_interval, :time, :default => nil
     config_param :use_legacy_template, :bool, :default => true
     config_param :catch_transport_exception_on_retry, :bool, :default => true
+    config_param :target_index_affinity, :bool, :default => false
 
     config_section :metadata, param_name: :metainfo, multi: false do
       config_param :include_chunk_id, :bool, :default => false
@@ -834,13 +836,14 @@ EOC
         extract_placeholders(@host, chunk)
       end
 
+      affinity_target_indices = get_affinity_target_indices(chunk)
       chunk.msgpack_each do |time, record|
         next unless record.is_a? Hash
 
         record = inject_chunk_id_to_record_if_needed(record, chunk_id)
 
         begin
-          meta, header, record = process_message(tag, meta, header, time, record, extracted_values)
+          meta, header, record = process_message(tag, meta, header, time, record, affinity_target_indices, extracted_values)
           info = if @include_index_in_url
                    RequestInfo.new(host, meta.delete("_index".freeze), meta["_index".freeze], meta.delete("_alias".freeze))
                  else
@@ -877,6 +880,42 @@ EOC
       end
     end
 
+    def target_index_affinity_enabled?()
+      @target_index_affinity && @logstash_format && @id_key && (@write_operation == UPDATE_OP || @write_operation == UPSERT_OP)
+    end
+
+    def get_affinity_target_indices(chunk)
+      indices = Hash.new
+      if target_index_affinity_enabled?()
+        id_key_accessor = record_accessor_create(@id_key)
+        ids = Set.new
+        chunk.msgpack_each do |time, record|
+          next unless record.is_a? Hash
+          begin
+            ids << id_key_accessor.call(record)
+          end
+        end
+        log.debug("Find affinity target_indices by quering on ES (write_operation #{@write_operation}) for ids: #{ids.to_a}")
+        options = {
+          :index => "#{logstash_prefix}#{@logstash_prefix_separator}*",
+        }
+        query = {
+          'query' => { 'ids' => { 'values' => ids.to_a } },
+          '_source' => false,
+          'sort' => [
+            {"_index" => {"order" => "desc"}}
+          ]
+        }
+        result = client.search(options.merge(:body => Yajl.dump(query)))
+        # There should be just one hit per _id, but in case there still is multiple, just the oldest index is stored to map
+        result['hits']['hits'].each do |hit|
+          indices[hit["_id"]] = hit["_index"]
+          log.debug("target_index for id: #{hit["_id"]} from es: #{hit["_index"]}")
+        end
+      end
+      indices
+    end
+
     def split_request?(bulk_message, info)
       # For safety.
     end
@@ -889,7 +928,7 @@ EOC
       false
     end
 
-    def process_message(tag, meta, header, time, record, extracted_values)
+    def process_message(tag, meta, header, time, record, affinity_target_indices, extracted_values)
       logstash_prefix, logstash_dateformat, index_name, type_name, _template_name, _customize_template, _deflector_alias, application_name, pipeline, _ilm_policy_id = extracted_values
 
       if @flatten_hashes
@@ -930,6 +969,15 @@ EOC
         record[@tag_key] = tag
       end
 
+      # If affinity target indices map has value for this particular id, use it as target_index
+      if !affinity_target_indices.empty?
+        id_accessor = record_accessor_create(@id_key)
+        id_value = id_accessor.call(record)
+        if affinity_target_indices.key?(id_value)
+          target_index = affinity_target_indices[id_value]
+        end
+      end
+
       target_type_parent, target_type_child_key = @target_type_key ? get_parent_of(record, @target_type_key) : nil
       if target_type_parent && target_type_parent[target_type_child_key]
         target_type = target_type_parent.delete(target_type_child_key)
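For reference, the body that `get_affinity_target_indices` sends to Elasticsearch reduces to an ids query of roughly this shape (a sketch assembled from the code above; the id values are placeholders):

```
{
  "query": { "ids": { "values": ["myuniqueId1", "myuniqueId2"] } },
  "_source": false,
  "sort": [
    { "_index": { "order": "desc" } }
  ]
}
```

It is executed against the `<logstash_prefix><separator>*` index pattern, and the `_index` of each hit is what later overrides the time-based target index in `process_message`.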
data/lib/fluent/plugin/out_elasticsearch_data_stream.rb
CHANGED
@@ -1,3 +1,4 @@
+
 require_relative 'out_elasticsearch'
 
 module Fluent::Plugin
@@ -8,6 +9,8 @@ module Fluent::Plugin
     helpers :event_emitter
 
     config_param :data_stream_name, :string
+    config_param :data_stream_ilm_name, :string, :default => :data_stream_name
+    config_param :data_stream_template_name, :string, :default => :data_stream_name
     # Elasticsearch 7.9 or later always support new style of index template.
     config_set_default :use_legacy_template, false
 
@@ -26,7 +29,7 @@ module Fluent::Plugin
 
       # ref. https://www.elastic.co/guide/en/elasticsearch/reference/master/indices-create-data-stream.html
       unless placeholder?(:data_stream_name_placeholder, @data_stream_name)
-
+        validate_data_stream_parameters
       else
         @use_placeholder = true
         @data_stream_names = []
@@ -36,8 +39,8 @@ module Fluent::Plugin
       unless @use_placeholder
         begin
           @data_stream_names = [@data_stream_name]
-          create_ilm_policy(@data_stream_name)
-          create_index_template(@data_stream_name)
+          create_ilm_policy(@data_stream_name, @data_stream_template_name, @data_stream_ilm_name, @host)
+          create_index_template(@data_stream_name, @data_stream_template_name, @data_stream_ilm_name, @host)
           create_data_stream(@data_stream_name)
         rescue => e
           raise Fluent::ConfigError, "Failed to create data stream: <#{@data_stream_name}> #{e.message}"
@@ -45,31 +48,35 @@ module Fluent::Plugin
       end
     end
 
-    def
-
-
-
-
-
-
+    def validate_data_stream_parameters
+      {"data_stream_name" => @data_stream_name,
+       "data_stream_template_name"=> @data_stream_template_name,
+       "data_stream_ilm_name" => @data_stream_ilm_name}.each do |parameter, value|
+        unless valid_data_stream_parameters?(value)
+          unless start_with_valid_characters?(value)
+            if not_dots?(value)
+              raise Fluent::ConfigError, "'#{parameter}' must not start with #{INVALID_START_CHRACTERS.join(",")}: <#{value}>"
+            else
+              raise Fluent::ConfigError, "'#{parameter}' must not be . or ..: <#{value}>"
+            end
+          end
+          unless valid_characters?(value)
+            raise Fluent::ConfigError, "'#{parameter}' must not contain invalid characters #{INVALID_CHARACTERS.join(",")}: <#{value}>"
+          end
+          unless lowercase_only?(value)
+            raise Fluent::ConfigError, "'#{parameter}' must be lowercase only: <#{value}>"
+          end
+          if value.bytes.size > 255
+            raise Fluent::ConfigError, "'#{parameter}' must not be longer than 255 bytes: <#{value}>"
          end
-      end
-      unless valid_characters?
-        raise Fluent::ConfigError, "'data_stream_name' must not contain invalid characters #{INVALID_CHARACTERS.join(",")}: <#{@data_stream_name}>"
-      end
-      unless lowercase_only?
-        raise Fluent::ConfigError, "'data_stream_name' must be lowercase only: <#{@data_stream_name}>"
-      end
-      if @data_stream_name.bytes.size > 255
-        raise Fluent::ConfigError, "'data_stream_name' must not be longer than 255 bytes: <#{@data_stream_name}>"
        end
      end
    end
 
-    def create_ilm_policy(
-      return if data_stream_exist?(
+    def create_ilm_policy(datastream_name, template_name, ilm_name, host)
+      return if data_stream_exist?(datastream_name) or template_exists?(template_name, host) or ilm_policy_exists?(ilm_name)
       params = {
-        policy_id: "#{
+        policy_id: "#{ilm_name}_policy",
         body: File.read(File.join(File.dirname(__FILE__), "default-ilm-policy.json"))
       }
       retry_operate(@max_retry_putting_template,
@@ -79,19 +86,19 @@ module Fluent::Plugin
       end
     end
 
-    def create_index_template(
-      return if data_stream_exist?(
+    def create_index_template(datastream_name, template_name, ilm_name, host)
+      return if data_stream_exist?(datastream_name) or template_exists?(template_name, host)
       body = {
-        "index_patterns" => ["#{
+        "index_patterns" => ["#{datastream_name}*"],
         "data_stream" => {},
         "template" => {
           "settings" => {
-            "index.lifecycle.name" => "#{
+            "index.lifecycle.name" => "#{ilm_name}_policy"
           }
         }
       }
       params = {
-        name:
+        name: template_name,
         body: body
       }
       retry_operate(@max_retry_putting_template,
@@ -101,9 +108,9 @@ module Fluent::Plugin
       end
     end
 
-    def data_stream_exist?(
+    def data_stream_exist?(datastream_name)
       params = {
-
+        name: datastream_name
       }
       begin
         response = @client.indices.get_data_stream(params)
@@ -114,10 +121,10 @@ module Fluent::Plugin
       end
     end
 
-    def create_data_stream(
-      return if data_stream_exist?(
+    def create_data_stream(datastream_name)
+      return if data_stream_exist?(datastream_name)
       params = {
-
+        name: datastream_name
       }
       retry_operate(@max_retry_putting_template,
                     @fail_on_putting_template_retry_exceed,
@@ -126,28 +133,48 @@ module Fluent::Plugin
       end
     end
 
-    def
-
-
-
-
-
+    def ilm_policy_exists?(policy_id)
+      begin
+        @client.ilm.get_policy(policy_id: policy_id)
+        true
+      rescue
+        false
+      end
+    end
+
+    def template_exists?(name, host = nil)
+      if @use_legacy_template
+        client(host).indices.get_template(:name => name)
+      else
+        client(host).indices.get_index_template(:name => name)
+      end
+      return true
+    rescue Elasticsearch::Transport::Transport::Errors::NotFound
+      return false
+    end
+
+    def valid_data_stream_parameters?(data_stream_parameter)
+      lowercase_only?(data_stream_parameter) and
+        valid_characters?(data_stream_parameter) and
+        start_with_valid_characters?(data_stream_parameter) and
+        not_dots?(data_stream_parameter) and
+        data_stream_parameter.bytes.size <= 255
     end
 
-    def lowercase_only?
-
+    def lowercase_only?(data_stream_parameter)
+      data_stream_parameter.downcase == data_stream_parameter
     end
 
-    def valid_characters?
-      not (INVALID_CHARACTERS.each.any? do |v|
+    def valid_characters?(data_stream_parameter)
+      not (INVALID_CHARACTERS.each.any? do |v| data_stream_parameter.include?(v) end)
     end
 
-    def start_with_valid_characters?
-      not (INVALID_START_CHRACTERS.each.any? do |v|
+    def start_with_valid_characters?(data_stream_parameter)
+      not (INVALID_START_CHRACTERS.each.any? do |v| data_stream_parameter.start_with?(v) end)
    end
 
-    def not_dots?
-      not (
+    def not_dots?(data_stream_parameter)
+      not (data_stream_parameter == "." or data_stream_parameter == "..")
     end
 
     def client_library_version
@@ -160,13 +187,18 @@ module Fluent::Plugin
 
     def write(chunk)
       data_stream_name = @data_stream_name
+      data_stream_template_name = @data_stream_template_name
+      data_stream_ilm_name = @data_stream_ilm_name
+      host = @host
       if @use_placeholder
         data_stream_name = extract_placeholders(@data_stream_name, chunk)
+        data_stream_template_name = extract_placeholders(@data_stream_template_name, chunk)
+        data_stream_ilm_name = extract_placeholders(@data_stream_ilm_name, chunk)
         unless @data_stream_names.include?(data_stream_name)
           begin
-            create_ilm_policy(data_stream_name)
-            create_index_template(data_stream_name)
             create_data_stream(data_stream_name)
+            create_ilm_policy(data_stream_name, data_stream_template_name, data_stream_ilm_name, host)
+            create_index_template(data_stream_name, data_stream_template_name, data_stream_ilm_name, host)
             @data_stream_names << data_stream_name
           rescue => e
             raise Fluent::ConfigError, "Failed to create data stream: <#{data_stream_name}> #{e.message}"
@@ -200,7 +232,7 @@ module Fluent::Plugin
           log.error "Could not bulk insert to Data Stream: #{data_stream_name} #{response}"
         end
       rescue => e
-
+        raise RecoverableRequestFailure, "could not push logs to Elasticsearch cluster (#{data_stream_name}): #{e.message}"
       end
     end
 
|