fluent-plugin-elasticsearch 5.0.3 → 5.1.1

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: 6eb418d889b91bf79c37c1cd72789981eb4133bcfc34e21b26e79bb919462272
- data.tar.gz: 168ddc77fb73216da63f1ce65ed673cf12ab13d6db549fc3e2da81956c938990
+ metadata.gz: 27d74e7048671def02b98e337c052c395152021a4a3f4c2138d1780c725d09bd
+ data.tar.gz: eb5282b8e688b091700a549c711af2f1959bc9b1b2c4f6fd1b49e3119c62ddb7
  SHA512:
- metadata.gz: 5c4a2a8f63b25ea8785e0d4ef291043a93a86fea8f42638caaf46dd3efe29fd0a12919dd339d8a28c97aef65494ba054a3ff3312a122ab2cd89cb54cb069055c
- data.tar.gz: 6b13851ce29b2a6f2083ebc57ae733a9ca00d79a9fd09ac4c7630d99a9c6d2da302c48f2e0cd901810e0a851911e14f8355fc336e21afd97d3f33fe205b0c78a
+ metadata.gz: 3a3ad9fa5259fcd1e80a85bdf7d1acd11cd26675d4f47f326100db242f0f9320232530099eb5346e5ea11aba76c4cc66cfc2e97f6393ca95d1227281217283ea
+ data.tar.gz: f97182a9487be71d34ddcd8ec2dd046eca9c6de1cae55d3842feeeaef627f23a05862be4783aee37fe38c84cba3bf984a51fb7fb3e8bad50f1bf0f57c956803e
@@ -8,7 +8,7 @@ jobs:
  strategy:
  fail-fast: false
  matrix:
- ruby: [ '2.5', '2.6', '2.7', '3.0' ]
+ ruby: [ '2.6', '2.7', '3.0' ]
  os:
  - ubuntu-latest
  name: Ruby ${{ matrix.ruby }} unit testing on ${{ matrix.os }}
@@ -8,7 +8,7 @@ jobs:
  strategy:
  fail-fast: false
  matrix:
- ruby: [ '2.5', '2.6', '2.7', '3.0' ]
+ ruby: [ '2.6', '2.7', '3.0' ]
  os:
  - macOS-latest
  name: Ruby ${{ matrix.ruby }} unit testing on ${{ matrix.os }}
@@ -8,7 +8,7 @@ jobs:
  strategy:
  fail-fast: false
  matrix:
- ruby: [ '2.5', '2.6', '2.7', '3.0' ]
+ ruby: [ '2.6', '2.7', '3.0' ]
  os:
  - windows-latest
  name: Ruby ${{ matrix.ruby }} unit testing on ${{ matrix.os }}
data/History.md CHANGED
@@ -1,6 +1,25 @@
  ## Changelog [[tags]](https://github.com/uken/fluent-plugin-elasticsearch/tags)

  ### [Unreleased]
+
+ ### 5.1.1
+ - Report appropriate error for data_stream parameters (#922)
+ - Add ILM and template parameters for data streams (#920)
+ - Support Buffer in Data Stream Output (#917)
+
+ ### 5.1.0
+ - Correct default target bytes value (#914)
+ - Handle elasticsearch-ruby 7.14 properly (#913)
+
+ ### 5.0.5
+ - Drop json_parse_exception messages for bulk failures (#900)
+ - GitHub Actions: Drop Ruby 2.5 due to EOL (#894)
+
+ ### 5.0.4
+ - test: out_elasticsearch: Remove a needless headers from affinity stub (#888)
+ - Target Index Affinity (#883)
+
+ ### 5.0.3
  - Fix use_legacy_template documentation (#880)
  - Add FAQ for dynamic index/template (#878)
  - Handle IPv6 address string on host and hosts parameters (#877)
data/README.md CHANGED
@@ -11,7 +11,7 @@ Send your logs to Elasticsearch (and search them with Kibana maybe?)

  Note: For Amazon Elasticsearch Service please consider using [fluent-plugin-aws-elasticsearch-service](https://github.com/atomita/fluent-plugin-aws-elasticsearch-service)

- Current maintainers: @cosmo0920
+ Current maintainers: [Hiroshi Hatake | @cosmo0920](https://github.com/cosmo0920), [Kentaro Hayashi | @kenhys](https://github.com/kenhys)

  * [Installation](#installation)
  * [Usage](#usage)
@@ -38,6 +38,7 @@ Current maintainers: @cosmo0920
  + [suppress_type_name](#suppress_type_name)
  + [target_index_key](#target_index_key)
  + [target_type_key](#target_type_key)
+ + [target_index_affinity](#target_index_affinity)
  + [template_name](#template_name)
  + [template_file](#template_file)
  + [template_overwrite](#template_overwrite)
@@ -454,6 +455,75 @@ and this record will be written to the specified index (`logstash-2014.12.19`) r

  Similar to the `target_index_key` config, find the type name to write to in the record under this key (or nested record). If the key is not found in the record, it falls back to `type_name` (default "fluentd").

+ ### target_index_affinity
+
+ Enable the plugin to dynamically select a logstash time-based target index in update/upsert operations, based on already indexed data rather than the current time of indexing.
+
+ ```
+ target_index_affinity true # defaults to false
+ ```
+
+ By default the plugin writes data to a logstash-format index based on the current time; with a daily index, for example, data arriving after midnight is written to the newly created index. This is normally fine when data comes from a single source and is not updated after indexing.
+
+ But consider a use case where data is also updated after indexing, `id_key` is used to identify the document uniquely for updating, and the logstash format is still wanted for easy data management and retention. Updates are done right after indexing to complete the document (not all data is available from a single source), and no further updates happen later. In this case a problem occurs at index rotation time, when writes with the same `id_key` value may go to two different indices.
+
+ With this setting the plugin searches existing data using Elasticsearch's [ids query](https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-ids-query.html) with the `id_key` value (over the `logstash_prefix` plus `logstash_prefix_separator` index pattern, e.g. `logstash-*`). The index of the found document is used for the update/upsert. When no data is found, the record is written to the current logstash index as usual.
+
+ This setting requires the following other settings:
+ ```
+ logstash_format true
+ id_key myId # Some field on your data to identify the data uniquely
+ write_operation upsert # upsert or update
+ ```
+
+ Suppose you have the following situation, where two different `<match>` sections consume data from two different Kafka topics independently but close in time to each other (order not known):
+
+ ```
+ <match data1>
+   @type elasticsearch
+   ...
+   id_key myId
+   write_operation upsert
+   logstash_format true
+   logstash_dateformat %Y.%m.%d
+   logstash_prefix myindexprefix
+   target_index_affinity true
+   ...
+ </match>
+
+ <match data2>
+   @type elasticsearch
+   ...
+   id_key myId
+   write_operation upsert
+   logstash_format true
+   logstash_dateformat %Y.%m.%d
+   logstash_prefix myindexprefix
+   target_index_affinity true
+   ...
+ </match>
+ ```
+
+ If your first (data1) input is:
+ ```
+ {
+   "myId": "myuniqueId1",
+   "datafield1": "some value"
+ }
+ ```
+
+ and your second (data2) input is:
+ ```
+ {
+   "myId": "myuniqueId1",
+   "datafield99": "some important data from other source tightly related to id myuniqueId1 and wanted to be in same document."
+ }
+ ```
+
+ Today's date is 10.05.2021, so data is written to index `myindexprefix-2021.05.10` when both data1 and data2 are consumed during the day. But when we are close to index rotation and data1 is consumed and indexed at `2021-05-10T23:59:55.59707672Z` while data2 is consumed a bit later at `2021-05-11T00:00:58.222079Z`, the logstash index has already rotated and data2 would normally have been written to index `myindexprefix-2021.05.11`. With `target_index_affinity` set to `true`, data2 is instead written to index `myindexprefix-2021.05.10`, into the same document as data1, and a duplicated document is avoided.
+
  ### template_name

  The name of the template to define. If a template by the name given is already present, it will be left unchanged, unless [template_overwrite](#template_overwrite) is set, in which case the template will be updated.
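To make the affinity lookup described above concrete, here is a hedged sketch of an equivalent ids query issued with the elasticsearch-ruby client. The client setup, index pattern, and id values are illustrative assumptions, not the plugin's exact call.

```
require 'elasticsearch'

# Hypothetical client and values; the plugin derives these from
# logstash_prefix, logstash_prefix_separator and id_key.
client = Elasticsearch::Client.new(url: 'http://localhost:9200')
ids    = ['myuniqueId1']

result = client.search(
  index: 'myindexprefix-*',            # logstash_prefix + separator + '*'
  body: {
    query:   { ids: { values: ids } }, # Elasticsearch ids query
    _source: false,                    # only _id and _index are needed
    sort:    [{ _index: { order: 'desc' } }]
  }
)

# Map each id to the index it already lives in; records with a hit are
# then upserted into that index instead of the current daily index.
affinity = result['hits']['hits'].each_with_object({}) do |hit, map|
  map[hit['_id']] = hit['_index']
end
```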
@@ -1451,7 +1521,7 @@ You can enable this feature by specifying `@type elasticsearch_data_stream`.
  data_stream_name test
  ```

- When `@type elasticsearch_data_stream` is used, ILM default policy is set to the specified data stream.
+ When `@type elasticsearch_data_stream` is used, the ILM default policy is applied to the specified data stream unless `data_stream_ilm_name` and `data_stream_template_name` are specified.
  Then, the matching index template is also created automatically.

  ### data_stream_name
@@ -1459,6 +1529,18 @@ Then, the matching index template is also created automatically.
  You can specify the Elasticsearch data stream name with this parameter.
  This parameter is mandatory for `elasticsearch_data_stream`.

+ ### data_stream_template_name
+
+ You can specify an existing matching index template for the data stream. If it is not present, a new matching index template is created.
+
+ Default value is `data_stream_name`.
+
+ ### data_stream_ilm_name
+
+ You can specify the name of an existing ILM policy, which will be applied to the data stream. If it is not present, a new default ILM policy is created (unless `data_stream_template_name` is defined, in which case the ILM policy is taken from the specified matching index template).
+
+ Default value is `data_stream_name`.
+
  There are some limitations on the naming rules.

  In more detail, please refer to the [Path parameters](https://www.elastic.co/guide/en/elasticsearch/reference/master/indices-create-data-stream.html#indices-create-data-stream-api-path-params).
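A hedged configuration sketch showing the two new parameters together; the stream, template, and policy names below are illustrative, not defaults:

```
<match my.datastream.logs>
  @type elasticsearch_data_stream
  data_stream_name logs-app-default
  # Reuse an index template and ILM policy that already exist in the cluster;
  # if these are omitted they both default to the data stream name.
  data_stream_template_name logs-app-template
  data_stream_ilm_name logs-app-ilm-policy
</match>
```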
@@ -3,7 +3,7 @@ $:.push File.expand_path('../lib', __FILE__)

  Gem::Specification.new do |s|
  s.name = 'fluent-plugin-elasticsearch'
- s.version = '5.0.3'
+ s.version = '5.1.1'
  s.authors = ['diogo', 'pitr', 'Hiroshi Hatake']
  s.email = ['pitr.vern@gmail.com', 'me@diogoterror.com', 'cosmo0920.wp@gmail.com']
  s.description = %q{Elasticsearch output plugin for Fluent event collector}
@@ -23,6 +23,10 @@ class Fluent::Plugin::ElasticsearchErrorHandler
  unrecoverable_error_types.include?(type)
  end

+ def unrecoverable_record_error?(type)
+ ['json_parse_exception'].include?(type)
+ end
+
  def log_es_400_reason(&block)
  if @plugin.log_es_400_reason
  block.call
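For reference, the new predicate is just a membership test over a one-element list. A rough sketch of how it classifies bulk error types; the second example type is illustrative:

```
# Re-stated here only for illustration; mirrors the helper added above.
def unrecoverable_record_error?(type)
  ['json_parse_exception'].include?(type)
end

unrecoverable_record_error?('json_parse_exception')            #=> true, the record is not retried
unrecoverable_record_error?('es_rejected_execution_exception') #=> false, the record stays retryable
```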
@@ -43,15 +47,17 @@ class Fluent::Plugin::ElasticsearchErrorHandler
  stats = Hash.new(0)
  meta = {}
  header = {}
+ affinity_target_indices = @plugin.get_affinity_target_indices(chunk)
  chunk.msgpack_each do |time, rawrecord|
  bulk_message = ''
  next unless rawrecord.is_a? Hash
  begin
  # we need a deep copy for process_message to alter
  processrecord = Marshal.load(Marshal.dump(rawrecord))
- meta, header, record = @plugin.process_message(tag, meta, header, time, processrecord, extracted_values)
+ meta, header, record = @plugin.process_message(tag, meta, header, time, processrecord, affinity_target_indices, extracted_values)
  next unless @plugin.append_record_to_messages(@plugin.write_operation, meta, header, record, bulk_message)
  rescue => e
+ @plugin.log.debug("Exception in error handler during deep copy: #{e}")
  stats[:bad_chunk_record] += 1
  next
  end
@@ -105,10 +111,15 @@ class Fluent::Plugin::ElasticsearchErrorHandler
  elsif item[write_operation].has_key?('error') && item[write_operation]['error'].has_key?('type')
  type = item[write_operation]['error']['type']
  stats[type] += 1
- retry_stream.add(time, rawrecord)
  if unrecoverable_error?(type)
  raise ElasticsearchRequestAbortError, "Rejected Elasticsearch due to #{type}"
  end
+ if unrecoverable_record_error?(type)
+ @plugin.router.emit_error_event(tag, time, rawrecord, ElasticsearchError.new("#{status} - #{type}: #{reason}"))
+ next
+ else
+ retry_stream.add(time, rawrecord) unless unrecoverable_record_error?(type)
+ end
  else
  # When we don't have a type field, something changed in the API
  # expected return values (ES 2.x)
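With this change, a record whose bulk item fails with `json_parse_exception` is emitted to Fluentd's error stream via `emit_error_event` instead of being re-queued. A hedged sketch of how such records could be captured; the label body and file path are illustrative, not part of the plugin:

```
<label @ERROR>
  # Records rejected by Elasticsearch with json_parse_exception land here
  # together with the error reason; write them out for later inspection.
  <match **>
    @type file
    path /var/log/fluent/es_dropped_records
  </match>
</label>
```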
@@ -32,13 +32,25 @@ module Fluent::ElasticsearchIndexTemplate
  return false
  end

+ def host_unreachable_exceptions
+ if Gem::Version.new(::Elasticsearch::Transport::VERSION) >= Gem::Version.new("7.14.0")
+ # elasticsearch-ruby 7.14.0's elasticsearch-transport does not extend
+ # the Elasticsearch class on Transport.
+ # This is why #host_unreachable_exceptions is not callable directly
+ # via transport (only via transport's transport instance accessor) any more.
+ client.transport.transport.host_unreachable_exceptions
+ else
+ client.transport.host_unreachable_exceptions
+ end
+ end
+
  def retry_operate(max_retries, fail_on_retry_exceed = true, catch_trasport_exceptions = true)
  return unless block_given?
  retries = 0
  transport_errors = Elasticsearch::Transport::Transport::Errors.constants.map{ |c| Elasticsearch::Transport::Transport::Errors.const_get c } if catch_trasport_exceptions
  begin
  yield
- rescue *client.transport.host_unreachable_exceptions, *transport_errors, Timeout::Error => e
+ rescue *host_unreachable_exceptions, *transport_errors, Timeout::Error => e
  @_es = nil
  @_es_info = nil
  if retries < max_retries
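Why the extra indirection: in elasticsearch-ruby 7.14 the connection-failure exception list moved one level deeper on the client object. A minimal standalone sketch of the difference; the client construction and rescue body are illustrative assumptions:

```
require 'elasticsearch'

client = Elasticsearch::Client.new(url: 'http://localhost:9200')

exceptions =
  if Gem::Version.new(Elasticsearch::Transport::VERSION) >= Gem::Version.new('7.14.0')
    # 7.14+: client#transport returns a wrapper, and the actual transport
    # object (which knows the unreachable-host exceptions) sits one level deeper.
    client.transport.transport.host_unreachable_exceptions
  else
    # Older releases expose the list directly on #transport.
    client.transport.host_unreachable_exceptions
  end

begin
  client.ping
rescue *exceptions => e
  # Treat connection-level failures as retriable.
  warn "host unreachable: #{e.class}"
end
```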
@@ -2,6 +2,7 @@
  require 'date'
  require 'excon'
  require 'elasticsearch'
+ require 'set'
  begin
  require 'elasticsearch/xpack'
  rescue LoadError
@@ -71,7 +72,7 @@ module Fluent::Plugin
  DEFAULT_TYPE_NAME_ES_7x = "_doc".freeze
  DEFAULT_TYPE_NAME = "fluentd".freeze
  DEFAULT_RELOAD_AFTER = -1
- TARGET_BULK_BYTES = 20 * 1024 * 1024
+ DEFAULT_TARGET_BULK_BYTES = -1
  DEFAULT_POLICY_ID = "logstash-policy"

  config_param :host, :string, :default => 'localhost'
@@ -165,7 +166,7 @@ EOC
  config_param :suppress_doc_wrap, :bool, :default => false
  config_param :ignore_exceptions, :array, :default => [], value_type: :string, :desc => "Ignorable exception list"
  config_param :exception_backup, :bool, :default => true, :desc => "Chunk backup flag when ignore exception occured"
- config_param :bulk_message_request_threshold, :size, :default => TARGET_BULK_BYTES
+ config_param :bulk_message_request_threshold, :size, :default => DEFAULT_TARGET_BULK_BYTES
  config_param :compression_level, :enum, list: [:no_compression, :best_speed, :best_compression, :default_compression], :default => :no_compression
  config_param :enable_ilm, :bool, :default => false
  config_param :ilm_policy_id, :string, :default => DEFAULT_POLICY_ID
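Note the behaviour change here: `bulk_message_request_threshold` previously defaulted to `TARGET_BULK_BYTES` (20 MiB) and now defaults to `-1`, which appears to disable size-based splitting of bulk requests unless the option is set explicitly. A hedged configuration sketch for keeping the old behaviour; the match pattern is illustrative:

```
<match my.logs>
  @type elasticsearch
  # Restore the pre-5.1.0 behaviour of splitting bulk requests at ~20 MiB.
  bulk_message_request_threshold 20MB
</match>
```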
@@ -175,6 +176,7 @@ EOC
  config_param :truncate_caches_interval, :time, :default => nil
  config_param :use_legacy_template, :bool, :default => true
  config_param :catch_transport_exception_on_retry, :bool, :default => true
+ config_param :target_index_affinity, :bool, :default => false

  config_section :metadata, param_name: :metainfo, multi: false do
  config_param :include_chunk_id, :bool, :default => false
@@ -834,13 +836,14 @@ EOC
  extract_placeholders(@host, chunk)
  end

+ affinity_target_indices = get_affinity_target_indices(chunk)
  chunk.msgpack_each do |time, record|
  next unless record.is_a? Hash

  record = inject_chunk_id_to_record_if_needed(record, chunk_id)

  begin
- meta, header, record = process_message(tag, meta, header, time, record, extracted_values)
+ meta, header, record = process_message(tag, meta, header, time, record, affinity_target_indices, extracted_values)
  info = if @include_index_in_url
  RequestInfo.new(host, meta.delete("_index".freeze), meta["_index".freeze], meta.delete("_alias".freeze))
  else
@@ -877,6 +880,42 @@ EOC
  end
  end

+ def target_index_affinity_enabled?()
+ @target_index_affinity && @logstash_format && @id_key && (@write_operation == UPDATE_OP || @write_operation == UPSERT_OP)
+ end
+
+ def get_affinity_target_indices(chunk)
+ indices = Hash.new
+ if target_index_affinity_enabled?()
+ id_key_accessor = record_accessor_create(@id_key)
+ ids = Set.new
+ chunk.msgpack_each do |time, record|
+ next unless record.is_a? Hash
+ begin
+ ids << id_key_accessor.call(record)
+ end
+ end
+ log.debug("Find affinity target_indices by quering on ES (write_operation #{@write_operation}) for ids: #{ids.to_a}")
+ options = {
+ :index => "#{logstash_prefix}#{@logstash_prefix_separator}*",
+ }
+ query = {
+ 'query' => { 'ids' => { 'values' => ids.to_a } },
+ '_source' => false,
+ 'sort' => [
+ {"_index" => {"order" => "desc"}}
+ ]
+ }
+ result = client.search(options.merge(:body => Yajl.dump(query)))
+ # There should be just one hit per _id, but in case there still is multiple, just the oldest index is stored to map
+ result['hits']['hits'].each do |hit|
+ indices[hit["_id"]] = hit["_index"]
+ log.debug("target_index for id: #{hit["_id"]} from es: #{hit["_index"]}")
+ end
+ end
+ indices
+ end
+
  def split_request?(bulk_message, info)
  # For safety.
  end
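To make the data flow concrete: for the README example earlier in this diff, the helper above would return a map from each `id_key` value to the index where that id already lives. A small sketch with illustrative values:

```
# Illustration only: the shape of the map returned by get_affinity_target_indices.
affinity_target_indices = {
  "myuniqueId1" => "myindexprefix-2021.05.10"
}

# Later, in process_message, a record whose id_key value appears in the map
# is redirected to that index instead of today's index.
record       = { "myId" => "myuniqueId1", "datafield99" => "..." }
target_index = affinity_target_indices.fetch(record["myId"], "myindexprefix-2021.05.11")
# => "myindexprefix-2021.05.10"
```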
@@ -889,7 +928,7 @@ EOC
  false
  end

- def process_message(tag, meta, header, time, record, extracted_values)
+ def process_message(tag, meta, header, time, record, affinity_target_indices, extracted_values)
  logstash_prefix, logstash_dateformat, index_name, type_name, _template_name, _customize_template, _deflector_alias, application_name, pipeline, _ilm_policy_id = extracted_values

  if @flatten_hashes
@@ -930,6 +969,15 @@ EOC
  record[@tag_key] = tag
  end

+ # If affinity target indices map has value for this particular id, use it as target_index
+ if !affinity_target_indices.empty?
+ id_accessor = record_accessor_create(@id_key)
+ id_value = id_accessor.call(record)
+ if affinity_target_indices.key?(id_value)
+ target_index = affinity_target_indices[id_value]
+ end
+ end
+
  target_type_parent, target_type_child_key = @target_type_key ? get_parent_of(record, @target_type_key) : nil
  if target_type_parent && target_type_parent[target_type_child_key]
  target_type = target_type_parent.delete(target_type_child_key)
@@ -1,3 +1,4 @@
+
  require_relative 'out_elasticsearch'

  module Fluent::Plugin
@@ -8,6 +9,8 @@ module Fluent::Plugin
  helpers :event_emitter

  config_param :data_stream_name, :string
+ config_param :data_stream_ilm_name, :string, :default => :data_stream_name
+ config_param :data_stream_template_name, :string, :default => :data_stream_name
  # Elasticsearch 7.9 or later always support new style of index template.
  config_set_default :use_legacy_template, false

@@ -26,7 +29,7 @@ module Fluent::Plugin

  # ref. https://www.elastic.co/guide/en/elasticsearch/reference/master/indices-create-data-stream.html
  unless placeholder?(:data_stream_name_placeholder, @data_stream_name)
- validate_data_stream_name
+ validate_data_stream_parameters
  else
  @use_placeholder = true
  @data_stream_names = []
@@ -36,8 +39,8 @@ module Fluent::Plugin
  unless @use_placeholder
  begin
  @data_stream_names = [@data_stream_name]
- create_ilm_policy(@data_stream_name)
- create_index_template(@data_stream_name)
+ create_ilm_policy(@data_stream_name, @data_stream_template_name, @data_stream_ilm_name, @host)
+ create_index_template(@data_stream_name, @data_stream_template_name, @data_stream_ilm_name, @host)
  create_data_stream(@data_stream_name)
  rescue => e
  raise Fluent::ConfigError, "Failed to create data stream: <#{@data_stream_name}> #{e.message}"
@@ -45,31 +48,35 @@ module Fluent::Plugin
  end
  end

- def validate_data_stream_name
- unless valid_data_stream_name?
- unless start_with_valid_characters?
- if not_dots?
- raise Fluent::ConfigError, "'data_stream_name' must not start with #{INVALID_START_CHRACTERS.join(",")}: <#{@data_stream_name}>"
- else
- raise Fluent::ConfigError, "'data_stream_name' must not be . or ..: <#{@data_stream_name}>"
+ def validate_data_stream_parameters
+ {"data_stream_name" => @data_stream_name,
+ "data_stream_template_name"=> @data_stream_template_name,
+ "data_stream_ilm_name" => @data_stream_ilm_name}.each do |parameter, value|
+ unless valid_data_stream_parameters?(value)
+ unless start_with_valid_characters?(value)
+ if not_dots?(value)
+ raise Fluent::ConfigError, "'#{parameter}' must not start with #{INVALID_START_CHRACTERS.join(",")}: <#{value}>"
+ else
+ raise Fluent::ConfigError, "'#{parameter}' must not be . or ..: <#{value}>"
+ end
+ end
+ unless valid_characters?(value)
+ raise Fluent::ConfigError, "'#{parameter}' must not contain invalid characters #{INVALID_CHARACTERS.join(",")}: <#{value}>"
+ end
+ unless lowercase_only?(value)
+ raise Fluent::ConfigError, "'#{parameter}' must be lowercase only: <#{value}>"
+ end
+ if value.bytes.size > 255
+ raise Fluent::ConfigError, "'#{parameter}' must not be longer than 255 bytes: <#{value}>"
  end
- end
- unless valid_characters?
- raise Fluent::ConfigError, "'data_stream_name' must not contain invalid characters #{INVALID_CHARACTERS.join(",")}: <#{@data_stream_name}>"
- end
- unless lowercase_only?
- raise Fluent::ConfigError, "'data_stream_name' must be lowercase only: <#{@data_stream_name}>"
- end
- if @data_stream_name.bytes.size > 255
- raise Fluent::ConfigError, "'data_stream_name' must not be longer than 255 bytes: <#{@data_stream_name}>"
  end
  end
  end

- def create_ilm_policy(name)
- return if data_stream_exist?(name)
+ def create_ilm_policy(datastream_name, template_name, ilm_name, host)
+ return if data_stream_exist?(datastream_name) or template_exists?(template_name, host) or ilm_policy_exists?(ilm_name)
  params = {
- policy_id: "#{name}_policy",
+ policy_id: "#{ilm_name}_policy",
  body: File.read(File.join(File.dirname(__FILE__), "default-ilm-policy.json"))
  }
  retry_operate(@max_retry_putting_template,
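The renamed validator now applies the same naming rules to `data_stream_name`, `data_stream_template_name`, and `data_stream_ilm_name`. A rough sketch of values that pass or raise `Fluent::ConfigError`; the exact `INVALID_CHARACTERS` / `INVALID_START_CHRACTERS` lists are defined elsewhere in the plugin and follow Elasticsearch's data stream naming restrictions:

```
# Illustrative values only.
'logs-app-production'  # passes: lowercase, no forbidden characters, under 255 bytes
'Logs-App'             # raises Fluent::ConfigError: must be lowercase only
'..'                   # raises Fluent::ConfigError: must not be . or ..
'x' * 300              # raises Fluent::ConfigError: must not be longer than 255 bytes
```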
@@ -79,19 +86,19 @@ module Fluent::Plugin
  end
  end

- def create_index_template(name)
- return if data_stream_exist?(name)
+ def create_index_template(datastream_name, template_name, ilm_name, host)
+ return if data_stream_exist?(datastream_name) or template_exists?(template_name, host)
  body = {
- "index_patterns" => ["#{name}*"],
+ "index_patterns" => ["#{datastream_name}*"],
  "data_stream" => {},
  "template" => {
  "settings" => {
- "index.lifecycle.name" => "#{name}_policy"
+ "index.lifecycle.name" => "#{ilm_name}_policy"
  }
  }
  }
  params = {
- name: name,
+ name: template_name,
  body: body
  }
  retry_operate(@max_retry_putting_template,
@@ -101,9 +108,9 @@ module Fluent::Plugin
  end
  end

- def data_stream_exist?(name)
+ def data_stream_exist?(datastream_name)
  params = {
- "name": name
+ name: datastream_name
  }
  begin
  response = @client.indices.get_data_stream(params)
@@ -114,10 +121,10 @@ module Fluent::Plugin
  end
  end

- def create_data_stream(name)
- return if data_stream_exist?(name)
+ def create_data_stream(datastream_name)
+ return if data_stream_exist?(datastream_name)
  params = {
- "name": name
+ name: datastream_name
  }
  retry_operate(@max_retry_putting_template,
  @fail_on_putting_template_retry_exceed,
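A side note on the `"name": name` → `name: datastream_name` change in the params hashes: both literal forms produce a Symbol key in Ruby, so this is a readability cleanup rather than a behaviour change. A quick check:

```
{ "name": "logs" } == { name: "logs" }  #=> true
{ "name": "logs" }.keys.first           #=> :name (a Symbol, not a String)
```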
@@ -126,28 +133,48 @@ module Fluent::Plugin
  end
  end

- def valid_data_stream_name?
- lowercase_only? and
- valid_characters? and
- start_with_valid_characters? and
- not_dots? and
- @data_stream_name.bytes.size <= 255
+ def ilm_policy_exists?(policy_id)
+ begin
+ @client.ilm.get_policy(policy_id: policy_id)
+ true
+ rescue
+ false
+ end
+ end
+
+ def template_exists?(name, host = nil)
+ if @use_legacy_template
+ client(host).indices.get_template(:name => name)
+ else
+ client(host).indices.get_index_template(:name => name)
+ end
+ return true
+ rescue Elasticsearch::Transport::Transport::Errors::NotFound
+ return false
+ end
+
+ def valid_data_stream_parameters?(data_stream_parameter)
+ lowercase_only?(data_stream_parameter) and
+ valid_characters?(data_stream_parameter) and
+ start_with_valid_characters?(data_stream_parameter) and
+ not_dots?(data_stream_parameter) and
+ data_stream_parameter.bytes.size <= 255
  end

- def lowercase_only?
- @data_stream_name.downcase == @data_stream_name
+ def lowercase_only?(data_stream_parameter)
+ data_stream_parameter.downcase == data_stream_parameter
  end

- def valid_characters?
- not (INVALID_CHARACTERS.each.any? do |v| @data_stream_name.include?(v) end)
+ def valid_characters?(data_stream_parameter)
+ not (INVALID_CHARACTERS.each.any? do |v| data_stream_parameter.include?(v) end)
  end

- def start_with_valid_characters?
- not (INVALID_START_CHRACTERS.each.any? do |v| @data_stream_name.start_with?(v) end)
+ def start_with_valid_characters?(data_stream_parameter)
+ not (INVALID_START_CHRACTERS.each.any? do |v| data_stream_parameter.start_with?(v) end)
  end

- def not_dots?
- not (@data_stream_name == "." or @data_stream_name == "..")
+ def not_dots?(data_stream_parameter)
+ not (data_stream_parameter == "." or data_stream_parameter == "..")
  end

  def client_library_version
@@ -160,13 +187,18 @@ module Fluent::Plugin

  def write(chunk)
  data_stream_name = @data_stream_name
+ data_stream_template_name = @data_stream_template_name
+ data_stream_ilm_name = @data_stream_ilm_name
+ host = @host
  if @use_placeholder
  data_stream_name = extract_placeholders(@data_stream_name, chunk)
+ data_stream_template_name = extract_placeholders(@data_stream_template_name, chunk)
+ data_stream_ilm_name = extract_placeholders(@data_stream_ilm_name, chunk)
  unless @data_stream_names.include?(data_stream_name)
  begin
- create_ilm_policy(data_stream_name)
- create_index_template(data_stream_name)
  create_data_stream(data_stream_name)
+ create_ilm_policy(data_stream_name, data_stream_template_name, data_stream_ilm_name, host)
+ create_index_template(data_stream_name, data_stream_template_name, data_stream_ilm_name, host)
  @data_stream_names << data_stream_name
  rescue => e
  raise Fluent::ConfigError, "Failed to create data stream: <#{data_stream_name}> #{e.message}"
@@ -200,7 +232,7 @@ module Fluent::Plugin
  log.error "Could not bulk insert to Data Stream: #{data_stream_name} #{response}"
  end
  rescue => e
- log.error "Could not bulk insert to Data Stream: #{data_stream_name} #{e.message}"
+ raise RecoverableRequestFailure, "could not push logs to Elasticsearch cluster (#{data_stream_name}): #{e.message}"
  end
  end