logstash-input-elasticsearch 5.0.0 → 5.1.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA256:
3
- metadata.gz: b34b6c6d814152e88f320525ea0bb80bbf1e63ff962e022aaac0a2385dd087b6
4
- data.tar.gz: d142df9148ad69bf838d62badeec71382118741938db61e6aad0676bdb918a37
3
+ metadata.gz: dc85b0081373116cbedc717e9da3e383c8ec17288ae6fbd57cb0ed3878d5e954
4
+ data.tar.gz: 33feb6083ba4c7ce074517f366f2ad079d40ab25238841559759fcadae9f8e04
5
5
  SHA512:
6
- metadata.gz: 19b2b1325ded83b5b93966365f855f104ba1881f2c991ffdbe92216e08d12d18a7b3ddd4a14d755f6d55c85c98e00d12ca566188c63706d6db1f0aa5b085048b
7
- data.tar.gz: ff5de17e75281d8ddd0be70167f2c4dee0a90eef328c7e486b704e79fe10db7b7108b733f77438386a7abb18d504efbef5aaf7b0f34a6c8edd62791640514b7b
6
+ metadata.gz: acde0d0c551d2f91f8dea194499dedec6e3285ea4149a0a15111484e1a95d13e97a38fdc97cbe36d57b554aa7092e4fdc6e3214cf901f44315a6855356a25c67
7
+ data.tar.gz: 18d066e72ff514e0c2ba0777a6f5f755424b2873015b0f0417100dd18124d0caaf4ef7e8ca72edc89548c159628358620ff75a2f90be1673ae00516c69490caa
data/CHANGELOG.md CHANGED
@@ -1,3 +1,13 @@
1
+ ## 5.1.0
2
+ - Add "cursor"-like index tracking [#205](https://github.com/logstash-plugins/logstash-input-elasticsearch/pull/205)
3
+
4
+ ## 5.0.2
5
+ - Add elastic-transport client support used in elasticsearch-ruby 8.x [#223](https://github.com/logstash-plugins/logstash-input-elasticsearch/pull/223)
6
+
7
+ ## 5.0.1
8
+ - Fix: prevent plugin crash when hits contain illegal structure [#218](https://github.com/logstash-plugins/logstash-input-elasticsearch/pull/218)
9
+ - When a hit cannot be converted to an event, the input now emits an event tagged with `_elasticsearch_input_failure` with an `[event][original]` containing a JSON-encoded string representation of the entire hit.
10
+
1
11
  ## 5.0.0
2
12
  - SSL settings that were marked deprecated in version `4.17.0` are now marked obsolete, and will prevent the plugin from starting.
3
13
  - These settings are:
@@ -5,6 +15,7 @@
5
15
  - `ca_file`, which should be replaced by `ssl_certificate_authorities`
6
16
  - `ssl_certificate_verification`, which should be replaced by `ssl_verification_mode`
7
17
  - [#213](https://github.com/logstash-plugins/logstash-input-elasticsearch/pull/213)
18
+ - Add support for custom headers [#207](https://github.com/logstash-plugins/logstash-input-elasticsearch/pull/207)
8
19
 
9
20
  ## 4.20.5
10
21
  - Add `x-elastic-product-origin` header to Elasticsearch requests [#211](https://github.com/logstash-plugins/logstash-input-elasticsearch/pull/211)
data/docs/index.asciidoc CHANGED
@@ -48,7 +48,7 @@ This would create an Elasticsearch query with the following format:
48
48
  "sort": [ "_doc" ]
49
49
  }'
50
50
 
51
-
51
+ [id="plugins-{type}s-{plugin}-scheduling"]
52
52
  ==== Scheduling
53
53
 
54
54
  Input from this plugin can be scheduled to run periodically according to a specific
@@ -93,6 +93,143 @@ The plugin logs a warning when ECS is enabled and `target` isn't set.
93
93
 
94
94
  TIP: Set the `target` option to avoid potential schema conflicts.
95
95
 
96
+ [id="plugins-{type}s-{plugin}-failure-handling"]
97
+ ==== Failure handling
98
+
99
+ When this input plugin cannot create a structured `Event` from a hit result, it will instead create an `Event` tagged with `_elasticsearch_input_failure`, whose `[event][original]` field contains a JSON-encoded string representation of the entire hit.
100
+
101
+ Common causes are:
102
+
103
+ - When the hit result contains top-level fields that are {logstash-ref}/processing.html#reserved-fields[reserved in Logstash] but do not have the expected shape. Use the <<plugins-{type}s-{plugin}-target>> directive to avoid conflicts with the top-level namespace.
104
+ - When <<plugins-{type}s-{plugin}-docinfo>> is enabled and the docinfo fields cannot be merged into the hit result. Combine <<plugins-{type}s-{plugin}-target>> and <<plugins-{type}s-{plugin}-docinfo_target>> to avoid conflict.
105
+
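For example, a minimal pipeline sketch (the output paths and index names here are illustrative, not part of this plugin) that diverts failed hits away from the main flow for later inspection:

[source,ruby]
input {
  elasticsearch {
    hosts => ["https://localhost:9200"]
    index => "my-index"
    docinfo => true
    docinfo_target => "[@metadata][doc]"
  }
}
output {
  if "_elasticsearch_input_failure" in [tags] {
    # keep the raw hit (JSON string in [event][original]) for inspection
    file {
      path => "/var/log/logstash/es_input_failures.log"
      codec => json_lines
    }
  } else {
    stdout { codec => rubydebug }
  }
}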
106
+ [id="plugins-{type}s-{plugin}-cursor"]
107
+ ==== Tracking a field's value across runs
108
+
109
+ .Technical Preview: Tracking a field's value
110
+ ****
111
+ The feature that allows tracking a field's value across runs is in _Technical Preview_.
112
+ Configuration options and implementation details are subject to change in minor releases without being preceded by deprecation warnings.
113
+ ****
114
+
115
+ Some use cases require tracking the value of a particular field between two jobs.
116
+ Examples include:
117
+
118
+ * avoiding the need to re-process the entire result set of a long query after an unplanned restart
119
+ * grabbing only new data from an index instead of processing the entire set on each job.
120
+
121
+ The Elasticsearch input plugin provides the <<plugins-{type}s-{plugin}-tracking_field>> and <<plugins-{type}s-{plugin}-tracking_field_seed>> options.
122
+ When <<plugins-{type}s-{plugin}-tracking_field>> is set, the plugin records the value of that field for the last document retrieved in a run into
123
+ a file.
124
+ (The file location is configured with <<plugins-{type}s-{plugin}-last_run_metadata_path>>.)
125
+
126
+ You can then inject this value into the query using the placeholder `:last_value`.
127
+ The value will be injected into the query before execution, and then updated after the query completes if new data was found.
128
+
129
+ This feature works best when:
130
+
131
+ * the query sorts by the tracking field,
132
+ * the timestamp field is added by {es}, and
133
+ * the field type has enough resolution so that two events are unlikely to have the same value.
134
+
135
+ Consider using a tracking field whose type is https://www.elastic.co/guide/en/elasticsearch/reference/current/date_nanos.html[date nanoseconds].
136
+ If the tracking field is of this data type, you can use an extra placeholder called `:present` to inject the nanosecond-precision value of "now-30s".
137
+ This placeholder is useful as the right-hand side of a range filter, allowing the collection of
138
+ new data but leaving partially-searchable bulk request data to the next scheduled job.
139
+
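Conceptually, both placeholders are applied as plain string substitutions before the query is executed. A minimal Ruby sketch with illustrative values (this mirrors, but is not, the plugin's internal code):

[source,ruby]
require 'time'

query = '{ "range": { "event.ingested": { "gt": ":last_value", "lt": ":present" } } }'

last_value = "1970-01-01T00:00:00.000000000Z"   # the seed on a first run
present    = (Time.now.utc - 30).iso8601(9)     # "now - 30s" at nanosecond precision

puts query.gsub(":last_value", last_value).gsub(":present", present)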
140
+ [id="plugins-{type}s-{plugin}-tracking-sample"]
141
+ ===== Sample configuration: Track field value across runs
142
+
143
+ This section contains a series of steps to help you set up the "tailing" of data being written to a set of indices, using a date nanosecond field added by an Elasticsearch ingest pipeline and the `tracking_field` capability of this plugin.
144
+
145
+ . Create an ingest pipeline that adds Elasticsearch's `_ingest.timestamp` field to the documents as `event.ingested`:
146
+ +
147
+ [source, json]
148
+ PUT _ingest/pipeline/my-pipeline
149
+ {
150
+ "processors": [
151
+ {
152
+ "script": {
153
+ "lang": "painless",
154
+ "source": "ctx.putIfAbsent(\"event\", [:]); ctx.event.ingested = metadata().now.format(DateTimeFormatter.ISO_INSTANT);"
155
+ }
156
+ }
157
+ ]
158
+ }
159
+
160
+ [start=2]
161
+ . Create an index mapping where the tracking field is of date nanosecond type and invokes the defined pipeline:
162
+ +
163
+ [source, json]
164
+ PUT /_template/my_template
165
+ {
166
+ "index_patterns": ["test-*"],
167
+ "settings": {
168
+ "index.default_pipeline": "my-pipeline",
169
+ },
170
+ "mappings": {
171
+ "properties": {
172
+ "event": {
173
+ "properties": {
174
+ "ingested": {
175
+ "type": "date_nanos",
176
+ "format": "strict_date_optional_time_nanos"
177
+ }
178
+ }
179
+ }
180
+ }
181
+ }
182
+ }
183
+
184
+ [start=3]
185
+ . Define a query that looks at all data in the indices, sorted by the tracking field, with a range filter from the last value seen up to the present:
186
+ +
187
+ [source,json]
188
+ {
189
+ "query": {
190
+ "range": {
191
+ "event.ingested": {
192
+ "gt": ":last_value",
193
+ "lt": ":present"
194
+ }
195
+ }
196
+ },
197
+ "sort": [
198
+ {
199
+ "event.ingested": {
200
+ "order": "asc",
201
+ "format": "strict_date_optional_time_nanos",
202
+ "numeric_type": "date_nanos"
203
+ }
204
+ }
205
+ ]
206
+ }
207
+
208
+ [start=4]
209
+ . Configure the Elasticsearch input to query the indices with the query defined above, every minute, and track the `event.ingested` field:
210
+ +
211
+ [source, ruby]
212
+ input {
213
+ elasticsearch {
214
+ id => "tail_test_index"
215
+ hosts => [ 'https://..']
216
+ api_key => '....'
217
+ index => 'test-*'
218
+ query => '{ "query": { "range": { "event.ingested": { "gt": ":last_value", "lt": ":present"}}}, "sort": [ { "event.ingested": {"order": "asc", "format": "strict_date_optional_time_nanos", "numeric_type" : "date_nanos" } } ] }'
219
+ tracking_field => "[event][ingested]"
220
+ slices => 5 # optional use of slices to speed data processing, should be equal to or less than number of primary shards
221
+ schedule => '* * * * *' # every minute
222
+ schedule_overlap => false # don't accumulate jobs if one takes longer than 1 minute
223
+ }
224
+ }
225
+
226
+ With this sample setup, new documents are indexed into a `test-*` index.
227
+ The next scheduled run:
228
+
229
+ * selects all new documents since the last observed value of the tracking field,
230
+ * uses {ref}/point-in-time-api.html#point-in-time-api[Point in time (PIT)] + {ref}/paginate-search-results.html#search-after[Search after] to paginate through all the data, and
231
+ * updates the value of the field at the end of the pagination.
232
+
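To sanity-check the sample setup end to end, a short Ruby sketch using the `elasticsearch` gem (the URL, index name, and default `main` pipeline id are assumptions for illustration):

[source,ruby]
require 'elasticsearch'

client = Elasticsearch::Client.new(url: 'https://localhost:9200')

# index a document that matches the template pattern; the default
# pipeline should stamp it with event.ingested
client.index(index: 'test-1', body: { 'message' => 'hello' }, refresh: true)

hit = client.search(index: 'test-*', size: 1)['hits']['hits'].first
puts hit['_source'].dig('event', 'ingested')

# after a scheduled Logstash run, the persisted cursor holds the last observed value
puts File.read('/usr/share/logstash/data/plugins/inputs/elasticsearch/main/last_run_value')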
96
233
  [id="plugins-{type}s-{plugin}-options"]
97
234
  ==== Elasticsearch Input configuration options
98
235
 
@@ -101,9 +238,6 @@ This plugin supports these configuration options plus the <<plugins-{type}s-{plu
101
238
  NOTE: As of version `5.0.0` of this plugin, a number of previously deprecated settings related to SSL have been removed.
102
239
  Please check out <<plugins-{type}s-{plugin}-obsolete-options>> for details.
103
240
 
104
- NOTE: As of version `5.0.0` of this plugin, a number of previously deprecated settings related to SSL have been removed.
105
- Please check out <<plugins-{type}s-{plugin}-obsolete-options>> for details.
106
-
107
241
  [cols="<,<,<",options="header",]
108
242
  |=======================================================================
109
243
  |Setting |Input type|Required
@@ -119,12 +253,14 @@ Please check out <<plugins-{type}s-{plugin}-obsolete-options>> for details.
119
253
  | <<plugins-{type}s-{plugin}-ecs_compatibility>> |<<string,string>>|No
120
254
  | <<plugins-{type}s-{plugin}-hosts>> |<<array,array>>|No
121
255
  | <<plugins-{type}s-{plugin}-index>> |<<string,string>>|No
256
+ | <<plugins-{type}s-{plugin}-last_run_metadata_path>> |<<string,string>>|No
122
257
  | <<plugins-{type}s-{plugin}-password>> |<<password,password>>|No
123
258
  | <<plugins-{type}s-{plugin}-proxy>> |<<uri,uri>>|No
124
259
  | <<plugins-{type}s-{plugin}-query>> |<<string,string>>|No
125
260
  | <<plugins-{type}s-{plugin}-response_type>> |<<string,string>>, one of `["hits","aggregations"]`|No
126
261
  | <<plugins-{type}s-{plugin}-request_timeout_seconds>> | <<number,number>>|No
127
262
  | <<plugins-{type}s-{plugin}-schedule>> |<<string,string>>|No
263
+ | <<plugins-{type}s-{plugin}-schedule_overlap>> |<<boolean,boolean>>|No
128
264
  | <<plugins-{type}s-{plugin}-scroll>> |<<string,string>>|No
129
265
  | <<plugins-{type}s-{plugin}-search_api>> |<<string,string>>, one of `["auto", "search_after", "scroll"]`|No
130
266
  | <<plugins-{type}s-{plugin}-size>> |<<number,number>>|No
@@ -144,6 +280,8 @@ Please check out <<plugins-{type}s-{plugin}-obsolete-options>> for details.
144
280
  | <<plugins-{type}s-{plugin}-ssl_verification_mode>> |<<string,string>>, one of `["full", "none"]`|No
145
281
  | <<plugins-{type}s-{plugin}-socket_timeout_seconds>> | <<number,number>>|No
146
282
  | <<plugins-{type}s-{plugin}-target>> | {logstash-ref}/field-references-deepdive.html[field reference] | No
283
+ | <<plugins-{type}s-{plugin}-tracking_field>> |<<string,string>>|No
284
+ | <<plugins-{type}s-{plugin}-tracking_field_seed>> |<<string,string>>|No
147
285
  | <<plugins-{type}s-{plugin}-retries>> | <<number,number>>|No
148
286
  | <<plugins-{type}s-{plugin}-user>> |<<string,string>>|No
149
287
  |=======================================================================
@@ -323,6 +461,17 @@ Check out {ref}/api-conventions.html#api-multi-index[Multi Indices
323
461
  documentation] in the Elasticsearch documentation for info on
324
462
  referencing multiple indices.
325
463
 
464
+ [id="plugins-{type}s-{plugin}-last_run_metadata_path"]
465
+ ===== `last_run_metadata_path`
466
+
467
+ * Value type is <<string,string>>
468
+ * There is no default value for this setting.
469
+
470
+ The path to store the last observed value of the tracking field, when used.
471
+ By default this file is stored as `<path.data>/plugins/inputs/elasticsearch/<pipeline_id>/last_run_value`.
472
+
473
+ This setting should point to a file, not a directory, and Logstash must have read+write access to this file.
474
+
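For example, a sketch that stores the cursor file in a custom location (the path is illustrative):

[source,ruby]
input {
  elasticsearch {
    hosts => ["https://localhost:9200"]
    index => "test-*"
    tracking_field => "[event][ingested]"
    last_run_metadata_path => "/var/lib/logstash/es_input/last_run_value"
  }
}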
326
475
  [id="plugins-{type}s-{plugin}-password"]
327
476
  ===== `password`
328
477
 
@@ -403,6 +552,19 @@ for example: "* * * * *" (execute query every minute, on the minute)
403
552
  There is no schedule by default. If no schedule is given, then the statement is run
404
553
  exactly once.
405
554
 
555
+ [id="plugins-{type}s-{plugin}-schedule_overlap"]
556
+ ===== `schedule_overlap`
557
+
558
+ * Value type is <<boolean,boolean>>
559
+ * Default value is `true`
560
+
561
+ Whether to allow queuing of a scheduled run if a run is occurring.
562
+ While this is ideal for ensuring a new run happens immediately after the previous on finishes if there
563
+ is a lot of work to do, but given the queue is unbounded it may lead to an out of memory over long periods of time
564
+ if the queue grows continuously.
565
+
566
+ When in doubt, set `schedule_overlap` to false (it may become the default value in the future).
567
+
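For example, a sketch of a schedule that skips a new run while the previous one is still in flight:

[source,ruby]
input {
  elasticsearch {
    hosts => ["https://localhost:9200"]
    index => "test-*"
    schedule => "* * * * *"       # every minute
    schedule_overlap => false     # don't queue a run if one is still running
  }
}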
406
568
  [id="plugins-{type}s-{plugin}-scroll"]
407
569
  ===== `scroll`
408
570
 
@@ -615,6 +777,28 @@ When the `target` is set to a field reference, the `_source` of the hit is place
615
777
  This option can be useful to avoid populating unknown fields when a downstream schema such as ECS is enforced.
616
778
  It is also possible to target an entry in the event's metadata, which will be available during event processing but not exported to your outputs (e.g., `target \=> "[@metadata][_source]"`).
617
779
 
780
+ [id="plugins-{type}s-{plugin}-tracking_field"]
781
+ ===== `tracking_field`
782
+
783
+ * Value type is <<string,string>>
784
+ * There is no default value for this setting.
785
+
786
+ Which field from the last event of a previous run will be used as a cursor value for the following run.
787
+ The value of this field is injected into each query if the query uses the placeholder `:last_value`.
788
+ For the first query after a pipeline is started, the value is either read from the <<plugins-{type}s-{plugin}-last_run_metadata_path>> file,
789
+ or taken from the <<plugins-{type}s-{plugin}-tracking_field_seed>> setting.
790
+
791
+ Note: The tracking value is updated after each page is read and at the end of each Point in Time. In case of a crash, the last saved value will be used, so some duplication of data can occur. For this reason, using unique document IDs for each event in the downstream destination is recommended; see the sketch below.
792
+
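One way to get such unique document IDs downstream is deterministic ids derived from the event; a sketch assuming the `fingerprint` filter and an `elasticsearch` output (field choices are illustrative):

[source,ruby]
filter {
  fingerprint {
    source => ["message"]
    target => "[@metadata][fingerprint]"
    method => "SHA256"
  }
}
output {
  elasticsearch {
    hosts => ["https://localhost:9200"]
    # re-indexing the same source document overwrites instead of duplicating
    document_id => "%{[@metadata][fingerprint]}"
  }
}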
793
+ [id="plugins-{type}s-{plugin}-tracking_field_seed"]
794
+ ===== `tracking_field_seed`
795
+
796
+ * Value type is <<string,string>>
797
+ * Default value is `"1970-01-01T00:00:00.000000000Z"`
798
+
799
+ The starting value for the <<plugins-{type}s-{plugin}-tracking_field>> if there is no <<plugins-{type}s-{plugin}-last_run_metadata_path>> file yet.
800
+ This field defaults to the nanosecond-precision ISO8601 representation of `epoch`, "1970-01-01T00:00:00.000000000Z", since nanosecond-precision timestamps are the
801
+ most reliable data format to use for this feature.
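For example, to skip historical data on a brand-new pipeline, seed the cursor with a recent instant (the value is illustrative):

[source,ruby]
input {
  elasticsearch {
    hosts => ["https://localhost:9200"]
    index => "test-*"
    tracking_field => "[event][ingested]"
    tracking_field_seed => "2025-01-01T00:00:00.000000000Z"
  }
}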
618
802
 
619
803
  [id="plugins-{type}s-{plugin}-user"]
620
804
  ===== `user`
data/lib/logstash/inputs/elasticsearch/aggregation.rb CHANGED
@@ -12,14 +12,9 @@ module LogStash
12
12
  @client = client
13
13
  @plugin_params = plugin.params
14
14
 
15
+ @index = @plugin_params["index"]
15
16
  @size = @plugin_params["size"]
16
- @query = @plugin_params["query"]
17
17
  @retries = @plugin_params["retries"]
18
- @agg_options = {
19
- :index => @plugin_params["index"],
20
- :size => 0
21
- }.merge(:body => @query)
22
-
23
18
  @plugin = plugin
24
19
  end
25
20
 
@@ -33,10 +28,18 @@ module LogStash
33
28
  false
34
29
  end
35
30
 
36
- def do_run(output_queue)
31
+ def aggregation_options(query_object)
32
+ {
33
+ :index => @index,
34
+ :size => 0,
35
+ :body => query_object
36
+ }
37
+ end
38
+
39
+ def do_run(output_queue, query_object)
37
40
  logger.info("Aggregation starting")
38
41
  r = retryable(AGGREGATION_JOB) do
39
- @client.search(@agg_options)
42
+ @client.search(aggregation_options(query_object))
40
43
  end
41
44
  @plugin.push_hit(r, output_queue, 'aggregations') if r
42
45
  end
data/lib/logstash/inputs/elasticsearch/cursor_tracker.rb ADDED
@@ -0,0 +1,58 @@
1
+ require 'fileutils'
2
+
3
+ module LogStash; module Inputs; class Elasticsearch
4
+ class CursorTracker
5
+ include LogStash::Util::Loggable
6
+
7
+ attr_reader :last_value
8
+
9
+ def initialize(last_run_metadata_path:, tracking_field:, tracking_field_seed:)
10
+ @last_run_metadata_path = last_run_metadata_path
11
+ @last_value_hashmap = Java::java.util.concurrent.ConcurrentHashMap.new
12
+ @last_value = (IO.read(@last_run_metadata_path) rescue nil) || tracking_field_seed
13
+ @tracking_field = tracking_field
14
+ logger.info "Starting value for cursor field \"#{@tracking_field}\": #{@last_value}"
15
+ @mutex = Mutex.new
16
+ end
17
+
18
+ def checkpoint_cursor(intermediate: true)
19
+ @mutex.synchronize do
20
+ if intermediate
21
+ # in intermediate checkpoints pick the smallest
22
+ converge_last_value {|v1, v2| v1 < v2 ? v1 : v2}
23
+ else
24
+ # in the last search of a PIT choose the largest
25
+ converge_last_value {|v1, v2| v1 > v2 ? v1 : v2}
26
+ @last_value_hashmap.clear
27
+ end
28
+ IO.write(@last_run_metadata_path, @last_value)
29
+ end
30
+ end
31
+
32
+ def converge_last_value(&block)
33
+ return if @last_value_hashmap.empty?
34
+ new_last_value = @last_value_hashmap.reduceValues(1000, &block)
35
+ logger.debug? && logger.debug("converge_last_value: got #{@last_value_hashmap.values.inspect}. won: #{new_last_value}")
36
+ return if new_last_value == @last_value
37
+ @last_value = new_last_value
38
+ logger.info "New cursor value for field \"#{@tracking_field}\" is: #{new_last_value}"
39
+ end
40
+
41
+ def record_last_value(event)
42
+ value = event.get(@tracking_field)
43
+ logger.trace? && logger.trace("storing last_value if #{@tracking_field} for #{Thread.current.object_id}: #{value}")
44
+ @last_value_hashmap.put(Thread.current.object_id, value)
45
+ end
46
+
47
+ def inject_cursor(query_json)
48
+ # ":present" means "now - 30s" to avoid grabbing partially visible data in the PIT
49
+ result = query_json.gsub(":last_value", @last_value.to_s).gsub(":present", now_minus_30s)
50
+ logger.debug("inject_cursor: injected values for ':last_value' and ':present'", :query => result)
51
+ result
52
+ end
53
+
54
+ def now_minus_30s
55
+ Java::java.time.Instant.now.minusSeconds(30).to_s
56
+ end
57
+ end
58
+ end; end; end
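Illustrative usage of the class above (values and paths are examples; inside the plugin this wiring is done by `setup_cursor_tracker`):

[source,ruby]
tracker = LogStash::Inputs::Elasticsearch::CursorTracker.new(
  last_run_metadata_path: "/tmp/last_run_value",
  tracking_field: "[event][ingested]",
  tracking_field_seed: "1970-01-01T00:00:00.000000000Z")

query = tracker.inject_cursor('{"range":{"event.ingested":{"gt":":last_value","lt":":present"}}}')
# per hit:             tracker.record_last_value(event)
# per page:            tracker.checkpoint_cursor(intermediate: true)
# at the end of a PIT: tracker.checkpoint_cursor(intermediate: false)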
data/lib/logstash/inputs/elasticsearch/paginated_search.rb CHANGED
@@ -21,9 +21,10 @@ module LogStash
21
21
  @pipeline_id = plugin.pipeline_id
22
22
  end
23
23
 
24
- def do_run(output_queue)
25
- return retryable_search(output_queue) if @slices.nil? || @slices <= 1
24
+ def do_run(output_queue, query)
25
+ @query = query
26
26
 
27
+ return retryable_search(output_queue) if @slices.nil? || @slices <= 1
27
28
  retryable_slice_search(output_queue)
28
29
  end
29
30
 
@@ -122,6 +123,13 @@ module LogStash
122
123
  PIT_JOB = "create point in time (PIT)"
123
124
  SEARCH_AFTER_JOB = "search_after paginated search"
124
125
 
126
+ attr_accessor :cursor_tracker
127
+
128
+ def do_run(output_queue, query)
129
+ super(output_queue, query)
130
+ @cursor_tracker.checkpoint_cursor(intermediate: false) if @cursor_tracker
131
+ end
132
+
125
133
  def pit?(id)
126
134
  !!id&.is_a?(String)
127
135
  end
@@ -192,6 +200,8 @@ module LogStash
192
200
  end
193
201
  end
194
202
 
203
+ @cursor_tracker.checkpoint_cursor(intermediate: true) if @cursor_tracker
204
+
195
205
  logger.info("Query completed", log_details)
196
206
  end
197
207
 
data/lib/logstash/inputs/elasticsearch.rb CHANGED
@@ -13,9 +13,7 @@ require "logstash/plugin_mixins/normalize_config_support"
13
13
  require "base64"
14
14
 
15
15
  require "elasticsearch"
16
- require "elasticsearch/transport/transport/http/manticore"
17
- require_relative "elasticsearch/patches/_elasticsearch_transport_http_manticore"
18
- require_relative "elasticsearch/patches/_elasticsearch_transport_connections_selector"
16
+ require "manticore"
19
17
 
20
18
  # .Compatibility Note
21
19
  # [NOTE]
@@ -75,6 +73,7 @@ class LogStash::Inputs::Elasticsearch < LogStash::Inputs::Base
75
73
 
76
74
  require 'logstash/inputs/elasticsearch/paginated_search'
77
75
  require 'logstash/inputs/elasticsearch/aggregation'
76
+ require 'logstash/inputs/elasticsearch/cursor_tracker'
78
77
 
79
78
  include LogStash::PluginMixins::ECSCompatibilitySupport(:disabled, :v1, :v8 => :v1)
80
79
  include LogStash::PluginMixins::ECSCompatibilitySupport::TargetCheck
@@ -126,6 +125,20 @@ class LogStash::Inputs::Elasticsearch < LogStash::Inputs::Base
126
125
  # by this pipeline input.
127
126
  config :slices, :validate => :number
128
127
 
128
+ # Enable tracking the value of a given field to be used as a cursor
129
+ # Main concerns:
130
+ # * using anything other than _event.timestamp easily leads to data loss
131
+ # * the first "synchronization run can take a long time"
132
+ config :tracking_field, :validate => :string
133
+
134
+ # Define the initial seed value of the tracking_field
135
+ config :tracking_field_seed, :validate => :string, :default => "1970-01-01T00:00:00.000000000Z"
136
+
137
+ # The location where the tracking field value will be stored
138
+ # The value is persisted after each scheduled run (and not per result)
139
+ # If it's not set it defaults to '${path.data}/plugins/inputs/elasticsearch/<pipeline_id>/last_run_value'
140
+ config :last_run_metadata_path, :validate => :string
141
+
129
142
  # If set, include Elasticsearch document information such as index, type, and
130
143
  # the id in the event.
131
144
  #
@@ -252,6 +265,10 @@ class LogStash::Inputs::Elasticsearch < LogStash::Inputs::Base
252
265
  # exactly once.
253
266
  config :schedule, :validate => :string
254
267
 
268
+ # Allow scheduled runs to overlap (enabled by default). Setting to false will
269
+ # only start a new scheduled run after the previous one completes.
270
+ config :schedule_overlap, :validate => :boolean
271
+
255
272
  # If set, the _source of each hit will be added nested under the target instead of at the top-level
256
273
  config :target, :validate => :field_reference
257
274
 
@@ -316,7 +333,7 @@ class LogStash::Inputs::Elasticsearch < LogStash::Inputs::Base
316
333
  @client_options = {
317
334
  :hosts => hosts,
318
335
  :transport_options => transport_options,
319
- :transport_class => ::Elasticsearch::Transport::Transport::HTTP::Manticore,
336
+ :transport_class => get_transport_client_class,
320
337
  :ssl => ssl_options
321
338
  }
322
339
 
@@ -330,26 +347,55 @@ class LogStash::Inputs::Elasticsearch < LogStash::Inputs::Base
330
347
 
331
348
  setup_query_executor
332
349
 
350
+ setup_cursor_tracker
351
+
333
352
  @client
334
353
  end
335
354
 
336
355
  def run(output_queue)
337
356
  if @schedule
338
- scheduler.cron(@schedule) { @query_executor.do_run(output_queue) }
357
+ scheduler.cron(@schedule, :overlap => @schedule_overlap) do
358
+ @query_executor.do_run(output_queue, get_query_object())
359
+ end
339
360
  scheduler.join
340
361
  else
341
- @query_executor.do_run(output_queue)
362
+ @query_executor.do_run(output_queue, get_query_object())
363
+ end
364
+ end
365
+
366
+ def get_query_object
367
+ if @cursor_tracker
368
+ query = @cursor_tracker.inject_cursor(@query)
369
+ @logger.debug("new query is #{query}")
370
+ else
371
+ query = @query
342
372
  end
373
+ LogStash::Json.load(query)
343
374
  end
344
375
 
345
376
  ##
346
377
  # This can be called externally from the query_executor
347
378
  public
348
379
  def push_hit(hit, output_queue, root_field = '_source')
349
- event = targeted_event_factory.new_event hit[root_field]
350
- set_docinfo_fields(hit, event) if @docinfo
380
+ event = event_from_hit(hit, root_field)
351
381
  decorate(event)
352
382
  output_queue << event
383
+ record_last_value(event)
384
+ end
385
+
386
+ def record_last_value(event)
387
+ @cursor_tracker.record_last_value(event) if @tracking_field
388
+ end
389
+
390
+ def event_from_hit(hit, root_field)
391
+ event = targeted_event_factory.new_event hit[root_field]
392
+ set_docinfo_fields(hit, event) if @docinfo
393
+
394
+ event
395
+ rescue => e
396
+ serialized_hit = hit.to_json
397
+ logger.warn("Event creation error, original data now in [event][original] field", message: e.message, exception: e.class, data: serialized_hit)
398
+ return event_factory.new_event('event' => { 'original' => serialized_hit }, 'tags' => ['_elasticsearch_input_failure'])
353
399
  end
354
400
 
355
401
  def set_docinfo_fields(hit, event)
@@ -357,10 +403,8 @@ class LogStash::Inputs::Elasticsearch < LogStash::Inputs::Base
357
403
  docinfo_target = event.get(@docinfo_target) || {}
358
404
 
359
405
  unless docinfo_target.is_a?(Hash)
360
- @logger.error("Incompatible Event, incompatible type for the docinfo_target=#{@docinfo_target} field in the `_source` document, expected a hash got:", :docinfo_target_type => docinfo_target.class, :event => event.to_hash_with_metadata)
361
-
362
- # TODO: (colin) I am not sure raising is a good strategy here?
363
- raise Exception.new("Elasticsearch input: incompatible event")
406
+ # expect error to be handled by `#event_from_hit`
407
+ fail RuntimeError, "Incompatible event; unable to merge docinfo fields into docinfo_target=`#{@docinfo_target}`"
364
408
  end
365
409
 
366
410
  @docinfo_fields.each do |field|
@@ -634,6 +678,42 @@ class LogStash::Inputs::Elasticsearch < LogStash::Inputs::Base
634
678
  end
635
679
  end
636
680
 
681
+ def setup_cursor_tracker
682
+ return unless @tracking_field
683
+ return unless @query_executor.is_a?(LogStash::Inputs::Elasticsearch::SearchAfter)
684
+
685
+ if @resolved_search_api != "search_after" || @response_type != "hits"
686
+ raise ConfigurationError.new("The `tracking_field` feature can only be used with `search_after` non-aggregation queries")
687
+ end
688
+
689
+ @cursor_tracker = CursorTracker.new(last_run_metadata_path: last_run_metadata_path,
690
+ tracking_field: @tracking_field,
691
+ tracking_field_seed: @tracking_field_seed)
692
+ @query_executor.cursor_tracker = @cursor_tracker
693
+ end
694
+
695
+ def last_run_metadata_path
696
+ return @last_run_metadata_path if @last_run_metadata_path
697
+
698
+ last_run_metadata_path = ::File.join(LogStash::SETTINGS.get_value("path.data"), "plugins", "inputs", "elasticsearch", pipeline_id, "last_run_value")
699
+ FileUtils.mkdir_p ::File.dirname(last_run_metadata_path)
700
+ last_run_metadata_path
701
+ end
702
+
703
+ def get_transport_client_class
704
+ # LS-core includes the `elasticsearch` gem, which is composed of two separate gems: `elasticsearch-api` and a transport gem.
705
+ # The transport gem used to be `elasticsearch-transport`; it has since been superseded by `elastic-transport`.
706
+ # LS-core updated `elasticsearch` to > 8: https://github.com/elastic/logstash/pull/17161
707
+ # The following code keeps compatibility with both the `elasticsearch-transport` and `elastic-transport` gems.
708
+ require "elasticsearch/transport/transport/http/manticore"
709
+ require_relative "elasticsearch/patches/_elasticsearch_transport_http_manticore"
710
+ require_relative "elasticsearch/patches/_elasticsearch_transport_connections_selector"
711
+ ::Elasticsearch::Transport::Transport::HTTP::Manticore
712
+ rescue ::LoadError
713
+ require "elastic/transport/transport/http/manticore"
714
+ ::Elastic::Transport::Transport::HTTP::Manticore
715
+ end
716
+
637
717
  module URIOrEmptyValidator
638
718
  ##
639
719
  # @override to provide :uri_or_empty validator
data/logstash-input-elasticsearch.gemspec CHANGED
@@ -1,13 +1,13 @@
1
1
  Gem::Specification.new do |s|
2
2
 
3
3
  s.name = 'logstash-input-elasticsearch'
4
- s.version = '5.0.0'
4
+ s.version = '5.1.0'
5
5
  s.licenses = ['Apache License (2.0)']
6
6
  s.summary = "Reads query results from an Elasticsearch cluster"
7
7
  s.description = "This gem is a Logstash plugin required to be installed on top of the Logstash core pipeline using $LS_HOME/bin/logstash-plugin install gemname. This gem is not a stand-alone program"
8
8
  s.authors = ["Elastic"]
9
9
  s.email = 'info@elastic.co'
10
- s.homepage = "http://www.elastic.co/guide/en/logstash/current/index.html"
10
+ s.homepage = "https://elastic.co/logstash"
11
11
  s.require_paths = ["lib"]
12
12
 
13
13
  # Files
@@ -26,7 +26,7 @@ Gem::Specification.new do |s|
26
26
  s.add_runtime_dependency "logstash-mixin-validator_support", '~> 1.0'
27
27
  s.add_runtime_dependency "logstash-mixin-scheduler", '~> 1.0'
28
28
 
29
- s.add_runtime_dependency 'elasticsearch', '>= 7.17.9'
29
+ s.add_runtime_dependency 'elasticsearch', '>= 7.17.9', '< 9'
30
30
  s.add_runtime_dependency 'logstash-mixin-ca_trusted_fingerprint_support', '~> 1.0'
31
31
  s.add_runtime_dependency 'logstash-mixin-normalize_config_support', '~>1.0'
32
32
 
data/spec/fixtures/test_certs/GENERATED_AT ADDED
@@ -0,0 +1 @@
1
+ 2024-12-26T22:27:15+00:00
data/spec/fixtures/test_certs/ca.crt CHANGED
@@ -1,20 +1,19 @@
1
1
  -----BEGIN CERTIFICATE-----
2
- MIIDSTCCAjGgAwIBAgIUUcAg9c8B8jiliCkOEJyqoAHrmccwDQYJKoZIhvcNAQEL
3
- BQAwNDEyMDAGA1UEAxMpRWxhc3RpYyBDZXJ0aWZpY2F0ZSBUb29sIEF1dG9nZW5l
4
- cmF0ZWQgQ0EwHhcNMjEwODEyMDUxNDU1WhcNMjQwODExMDUxNDU1WjA0MTIwMAYD
5
- VQQDEylFbGFzdGljIENlcnRpZmljYXRlIFRvb2wgQXV0b2dlbmVyYXRlZCBDQTCC
6
- ASIwDQYJKoZIhvcNAQEBBQADggEPADCCAQoCggEBAK1HuusRuGNsztd4EQvqwcMr
7
- 8XvnNNaalerpMOorCGySEFrNf0HxDIVMGMCrOv1F8SvlcGq3XANs2MJ4F2xhhLZr
8
- PpqVHx+QnSZ66lu5R89QVSuMh/dCMxhNBlOA/dDlvy+EJBl9H791UGy/ChhSgaBd
9
- OKVyGkhjErRTeMIq7rR7UG6GL/fV+JGy41UiLrm1KQP7/XVD9UzZfGq/hylFkTPe
10
- oox5BUxdxUdDZ2creOID+agtIYuJVIkelKPQ+ljBY3kWBRexqJQsvyNUs1gZpjpz
11
- YUCzuVcXDRuJXYQXGqWXhsBPfJv+ZcSyMIBUfWT/G13cWU1iwufPy0NjajowPZsC
12
- AwEAAaNTMFEwHQYDVR0OBBYEFMgkye5+2l+TE0I6RsXRHjGBwpBGMB8GA1UdIwQY
13
- MBaAFMgkye5+2l+TE0I6RsXRHjGBwpBGMA8GA1UdEwEB/wQFMAMBAf8wDQYJKoZI
14
- hvcNAQELBQADggEBAIgtJW8sy5lBpzPRHkmWSS/SCZIPsABW+cHqQ3e0udrI3CLB
15
- G9n7yqAPWOBTbdqC2GM8dvAS/Twx4Bub/lWr84dFCu+t0mQq4l5kpJMVRS0KKXPL
16
- DwJbUN3oPNYy4uPn5Xi+XY3BYFce5vwJUsqIxeAbIOxVTNx++k5DFnB0ESAM23QL
17
- sgUZl7xl3/DkdO4oHj30gmTRW9bjCJ6umnHIiO3JoJatrprurUIt80vHC4Ndft36
18
- NBQ9mZpequ4RYjpSZNLcVsxyFAYwEY4g8MvH0MoMo2RRLfehmMCzXnI/Wh2qEyYz
19
- emHprBii/5y1HieKXlX9CZRb5qEPHckDVXW3znw=
2
+ MIIDFTCCAf2gAwIBAgIBATANBgkqhkiG9w0BAQsFADA0MTIwMAYDVQQDEylFbGFz
3
+ dGljIENlcnRpZmljYXRlIFRvb2wgQXV0b2dlbmVyYXRlZCBDQTAeFw0yNDEyMjYy
4
+ MjI3MTVaFw0yNTEyMjYyMjI3MTVaMDQxMjAwBgNVBAMTKUVsYXN0aWMgQ2VydGlm
5
+ aWNhdGUgVG9vbCBBdXRvZ2VuZXJhdGVkIENBMIIBIjANBgkqhkiG9w0BAQEFAAOC
6
+ AQ8AMIIBCgKCAQEArUe66xG4Y2zO13gRC+rBwyvxe+c01pqV6ukw6isIbJIQWs1/
7
+ QfEMhUwYwKs6/UXxK+VwardcA2zYwngXbGGEtms+mpUfH5CdJnrqW7lHz1BVK4yH
8
+ 90IzGE0GU4D90OW/L4QkGX0fv3VQbL8KGFKBoF04pXIaSGMStFN4wirutHtQboYv
9
+ 99X4kbLjVSIuubUpA/v9dUP1TNl8ar+HKUWRM96ijHkFTF3FR0NnZyt44gP5qC0h
10
+ i4lUiR6Uo9D6WMFjeRYFF7GolCy/I1SzWBmmOnNhQLO5VxcNG4ldhBcapZeGwE98
11
+ m/5lxLIwgFR9ZP8bXdxZTWLC58/LQ2NqOjA9mwIDAQABozIwMDAPBgNVHRMBAf8E
12
+ BTADAQH/MB0GA1UdDgQWBBTIJMnuftpfkxNCOkbF0R4xgcKQRjANBgkqhkiG9w0B
13
+ AQsFAAOCAQEAhfg/cmXc4Uh90yiXU8jOW8saQjTsq4ZMDQiLfJsNmNNYmHFN0vhv
14
+ lJRI1STdy7+GpjS5QbrMjQIxWSS8X8xysE4Rt81IrWmLuao35TRFyoiE1seBQ5sz
15
+ p/BxZUe57JvWi9dyzv2df4UfWFdGBhzdr80odZmz4i5VIv6qCKJKsGikcuLpepmp
16
+ E/UKnKHeR/dFWsxzA9P2OzHTUNBMOOA2PyAUL49pwoChwJeOWN/zAgwMWLbuHFG0
17
+ IN0u8swAmeH98QdvzbhiOatGNpqfTNvQEDc19yVjfXKpBVZQ79WtronYSqrbrUa1
18
+ T2zD8bIVP7CdddD/UmpT1SSKh4PJxudy5Q==
20
19
  -----END CERTIFICATE-----
data/spec/fixtures/test_certs/ca.der.sha256 CHANGED
@@ -1 +1 @@
1
- 195a7e7b1bc29f3d7913a918a44721704d27fa56facea0cd72a8093c7107c283
1
+ b1e955819b0d14f64f863adb103c248ddacf2e17bea48d04ee4b57c64814ccc4
data/spec/fixtures/test_certs/es.chain.crt ADDED
@@ -0,0 +1,38 @@
1
+ -----BEGIN CERTIFICATE-----
2
+ MIIDIzCCAgugAwIBAgIBATANBgkqhkiG9w0BAQsFADA0MTIwMAYDVQQDEylFbGFz
3
+ dGljIENlcnRpZmljYXRlIFRvb2wgQXV0b2dlbmVyYXRlZCBDQTAeFw0yNDEyMjYy
4
+ MjI3MTVaFw0yNTEyMjYyMjI3MTVaMA0xCzAJBgNVBAMTAmVzMIIBIjANBgkqhkiG
5
+ 9w0BAQEFAAOCAQ8AMIIBCgKCAQEArZLZvLSWDK7Ul+AaBnjU81dsfaow8zOjCC5V
6
+ V21nXpYzQJoQbuWcvGYxwL7ZDs2ca4Wc8BVCj1NDduHuP7U+QIlUdQpl8kh5a0Zz
7
+ 36pcFw7UyF51/AzWixJrht/Azzkb5cpZtE22ZK0KhS4oCsjJmTN0EABAsGhDI9/c
8
+ MjNrUC7iP0dvfOuzAPp7ufY83h98jKKXUYV24snbbvmqoWI6GQQNSG/sEo1+1UGH
9
+ /z07/mVKoBAa5DVoNGvxN0fCE7vW7hkhT8+frJcsYFatAbnf6ql0KzEa8lN9u0gR
10
+ hQNM3zcKKsjEMomBzVBc4SV3KXO0d/jGdDtlqsm2oXqlTMdtGwIDAQABo2cwZTAY
11
+ BgNVHREEETAPgg1lbGFzdGljc2VhcmNoMAkGA1UdEwQCMAAwHQYDVR0OBBYEFFQU
12
+ K+6Cg2kExRj1xSDzEi4kkgKXMB8GA1UdIwQYMBaAFMgkye5+2l+TE0I6RsXRHjGB
13
+ wpBGMA0GCSqGSIb3DQEBCwUAA4IBAQB6cZ7IrDzcAoOZgAt9RlOe2yzQeH+alttp
14
+ CSQVINjJotS1WvmtqjBB6ArqLpXIGU89TZsktNe/NQJzgYSaMnlIuHVLFdxJYmwU
15
+ T1cP6VC/brmqP/dd5y7VWE7Lp+Wd5CxKl/WY+9chmgc+a1fW/lnPEJJ6pca1Bo8b
16
+ byIL0yY2IUv4R2eh1IyQl9oGH1GOPLgO7cY04eajxYcOVA2eDSItoyDtrJfkFP/P
17
+ UXtC1JAkvWKuujFEiBj0AannhroWlp3gvChhBwCuCAU0KXD6g8BE8tn6oT1+FW7J
18
+ avSfHxAe+VHtYhF8sJ8jrdm0d7E4GKS9UR/pkLAL1JuRdJ1VkPx3
19
+ -----END CERTIFICATE-----
20
+ -----BEGIN CERTIFICATE-----
21
+ MIIDFTCCAf2gAwIBAgIBATANBgkqhkiG9w0BAQsFADA0MTIwMAYDVQQDEylFbGFz
22
+ dGljIENlcnRpZmljYXRlIFRvb2wgQXV0b2dlbmVyYXRlZCBDQTAeFw0yNDEyMjYy
23
+ MjI3MTVaFw0yNTEyMjYyMjI3MTVaMDQxMjAwBgNVBAMTKUVsYXN0aWMgQ2VydGlm
24
+ aWNhdGUgVG9vbCBBdXRvZ2VuZXJhdGVkIENBMIIBIjANBgkqhkiG9w0BAQEFAAOC
25
+ AQ8AMIIBCgKCAQEArUe66xG4Y2zO13gRC+rBwyvxe+c01pqV6ukw6isIbJIQWs1/
26
+ QfEMhUwYwKs6/UXxK+VwardcA2zYwngXbGGEtms+mpUfH5CdJnrqW7lHz1BVK4yH
27
+ 90IzGE0GU4D90OW/L4QkGX0fv3VQbL8KGFKBoF04pXIaSGMStFN4wirutHtQboYv
28
+ 99X4kbLjVSIuubUpA/v9dUP1TNl8ar+HKUWRM96ijHkFTF3FR0NnZyt44gP5qC0h
29
+ i4lUiR6Uo9D6WMFjeRYFF7GolCy/I1SzWBmmOnNhQLO5VxcNG4ldhBcapZeGwE98
30
+ m/5lxLIwgFR9ZP8bXdxZTWLC58/LQ2NqOjA9mwIDAQABozIwMDAPBgNVHRMBAf8E
31
+ BTADAQH/MB0GA1UdDgQWBBTIJMnuftpfkxNCOkbF0R4xgcKQRjANBgkqhkiG9w0B
32
+ AQsFAAOCAQEAhfg/cmXc4Uh90yiXU8jOW8saQjTsq4ZMDQiLfJsNmNNYmHFN0vhv
33
+ lJRI1STdy7+GpjS5QbrMjQIxWSS8X8xysE4Rt81IrWmLuao35TRFyoiE1seBQ5sz
34
+ p/BxZUe57JvWi9dyzv2df4UfWFdGBhzdr80odZmz4i5VIv6qCKJKsGikcuLpepmp
35
+ E/UKnKHeR/dFWsxzA9P2OzHTUNBMOOA2PyAUL49pwoChwJeOWN/zAgwMWLbuHFG0
36
+ IN0u8swAmeH98QdvzbhiOatGNpqfTNvQEDc19yVjfXKpBVZQ79WtronYSqrbrUa1
37
+ T2zD8bIVP7CdddD/UmpT1SSKh4PJxudy5Q==
38
+ -----END CERTIFICATE-----
data/spec/fixtures/test_certs/es.crt CHANGED
@@ -1,20 +1,19 @@
1
1
  -----BEGIN CERTIFICATE-----
2
- MIIDNjCCAh6gAwIBAgIUF9wE+oqGSbm4UVn1y9gEjzyaJFswDQYJKoZIhvcNAQEL
3
- BQAwNDEyMDAGA1UEAxMpRWxhc3RpYyBDZXJ0aWZpY2F0ZSBUb29sIEF1dG9nZW5l
4
- cmF0ZWQgQ0EwHhcNMjEwODEyMDUxNTI3WhcNMjQwODExMDUxNTI3WjANMQswCQYD
5
- VQQDEwJlczCCASIwDQYJKoZIhvcNAQEBBQADggEPADCCAQoCggEBAK2S2by0lgyu
6
- 1JfgGgZ41PNXbH2qMPMzowguVVdtZ16WM0CaEG7lnLxmMcC+2Q7NnGuFnPAVQo9T
7
- Q3bh7j+1PkCJVHUKZfJIeWtGc9+qXBcO1MhedfwM1osSa4bfwM85G+XKWbRNtmSt
8
- CoUuKArIyZkzdBAAQLBoQyPf3DIza1Au4j9Hb3zrswD6e7n2PN4ffIyil1GFduLJ
9
- 2275qqFiOhkEDUhv7BKNftVBh/89O/5lSqAQGuQ1aDRr8TdHwhO71u4ZIU/Pn6yX
10
- LGBWrQG53+qpdCsxGvJTfbtIEYUDTN83CirIxDKJgc1QXOEldylztHf4xnQ7ZarJ
11
- tqF6pUzHbRsCAwEAAaNnMGUwHQYDVR0OBBYEFFQUK+6Cg2kExRj1xSDzEi4kkgKX
12
- MB8GA1UdIwQYMBaAFMgkye5+2l+TE0I6RsXRHjGBwpBGMBgGA1UdEQQRMA+CDWVs
13
- YXN0aWNzZWFyY2gwCQYDVR0TBAIwADANBgkqhkiG9w0BAQsFAAOCAQEAinaknZIc
14
- 7xtQNwUwa+kdET+I4lMz+TJw9vTjGKPJqe082n81ycKU5b+a/OndG90z+dTwhShW
15
- f0oZdIe/1rDCdiRU4ceCZA4ybKrFDIbW8gOKZOx9rsgEx9XNELj4ocZTBqxjQmNE
16
- Ho91fli5aEm0EL2vJgejh4hcfDeElQ6go9gtvAHQ57XEADQSenvt69jOICOupnS+
17
- LSjDVhv/VLi3CAip0B+lD5fX/DVQdrJ62eRGuQYxoouE3saCO58qUUrKB39yD9KA
18
- qRA/sVxyLogxaU+5dLfc0NJdOqSzStxQ2vdMvAWo9tZZ2UBGFrk5SdwCQe7Yv5mX
19
- qi02i4q6meHGcw==
2
+ MIIDIzCCAgugAwIBAgIBATANBgkqhkiG9w0BAQsFADA0MTIwMAYDVQQDEylFbGFz
3
+ dGljIENlcnRpZmljYXRlIFRvb2wgQXV0b2dlbmVyYXRlZCBDQTAeFw0yNDEyMjYy
4
+ MjI3MTVaFw0yNTEyMjYyMjI3MTVaMA0xCzAJBgNVBAMTAmVzMIIBIjANBgkqhkiG
5
+ 9w0BAQEFAAOCAQ8AMIIBCgKCAQEArZLZvLSWDK7Ul+AaBnjU81dsfaow8zOjCC5V
6
+ V21nXpYzQJoQbuWcvGYxwL7ZDs2ca4Wc8BVCj1NDduHuP7U+QIlUdQpl8kh5a0Zz
7
+ 36pcFw7UyF51/AzWixJrht/Azzkb5cpZtE22ZK0KhS4oCsjJmTN0EABAsGhDI9/c
8
+ MjNrUC7iP0dvfOuzAPp7ufY83h98jKKXUYV24snbbvmqoWI6GQQNSG/sEo1+1UGH
9
+ /z07/mVKoBAa5DVoNGvxN0fCE7vW7hkhT8+frJcsYFatAbnf6ql0KzEa8lN9u0gR
10
+ hQNM3zcKKsjEMomBzVBc4SV3KXO0d/jGdDtlqsm2oXqlTMdtGwIDAQABo2cwZTAY
11
+ BgNVHREEETAPgg1lbGFzdGljc2VhcmNoMAkGA1UdEwQCMAAwHQYDVR0OBBYEFFQU
12
+ K+6Cg2kExRj1xSDzEi4kkgKXMB8GA1UdIwQYMBaAFMgkye5+2l+TE0I6RsXRHjGB
13
+ wpBGMA0GCSqGSIb3DQEBCwUAA4IBAQB6cZ7IrDzcAoOZgAt9RlOe2yzQeH+alttp
14
+ CSQVINjJotS1WvmtqjBB6ArqLpXIGU89TZsktNe/NQJzgYSaMnlIuHVLFdxJYmwU
15
+ T1cP6VC/brmqP/dd5y7VWE7Lp+Wd5CxKl/WY+9chmgc+a1fW/lnPEJJ6pca1Bo8b
16
+ byIL0yY2IUv4R2eh1IyQl9oGH1GOPLgO7cY04eajxYcOVA2eDSItoyDtrJfkFP/P
17
+ UXtC1JAkvWKuujFEiBj0AannhroWlp3gvChhBwCuCAU0KXD6g8BE8tn6oT1+FW7J
18
+ avSfHxAe+VHtYhF8sJ8jrdm0d7E4GKS9UR/pkLAL1JuRdJ1VkPx3
20
19
  -----END CERTIFICATE-----
data/spec/fixtures/test_certs/renew.sh ADDED
@@ -0,0 +1,15 @@
1
+ #!/usr/bin/env bash
2
+
3
+ set -e
4
+ cd "$(dirname "$0")"
5
+
6
+ openssl x509 -x509toreq -in ca.crt -copy_extensions copyall -signkey ca.key -out ca.csr
7
+ openssl x509 -req -copy_extensions copyall -days 365 -in ca.csr -set_serial 0x01 -signkey ca.key -out ca.crt && rm ca.csr
8
+ openssl x509 -in ca.crt -outform der | sha256sum | awk '{print $1}' > ca.der.sha256
9
+
10
+ openssl x509 -x509toreq -in es.crt -copy_extensions copyall -signkey es.key -out es.csr
11
+ openssl x509 -req -copy_extensions copyall -days 365 -in es.csr -set_serial 0x01 -CA ca.crt -CAkey ca.key -out es.crt && rm es.csr
12
+ cat es.crt ca.crt > es.chain.crt
13
+
14
+ # output ISO8601 timestamp to file
15
+ date -Iseconds > GENERATED_AT
data/spec/inputs/cursor_tracker_spec.rb ADDED
@@ -0,0 +1,72 @@
1
+ # encoding: utf-8
2
+ require "logstash/devutils/rspec/spec_helper"
3
+ require "logstash/devutils/rspec/shared_examples"
4
+ require "logstash/inputs/elasticsearch"
5
+ require "logstash/inputs/elasticsearch/cursor_tracker"
6
+
7
+ describe LogStash::Inputs::Elasticsearch::CursorTracker do
8
+
9
+ let(:last_run_metadata_path) { Tempfile.new('cursor_tracker_testing').path }
10
+ let(:tracking_field_seed) { "1980-01-01T23:59:59.999999999Z" }
11
+ let(:options) do
12
+ {
13
+ :last_run_metadata_path => last_run_metadata_path,
14
+ :tracking_field => "my_field",
15
+ :tracking_field_seed => tracking_field_seed
16
+ }
17
+ end
18
+
19
+ subject { described_class.new(**options) }
20
+
21
+ it "creating a class works" do
22
+ expect(subject).to be_a described_class
23
+ end
24
+
25
+ describe "checkpoint_cursor" do
26
+ before(:each) do
27
+ subject.checkpoint_cursor(intermediate: false) # store seed value
28
+ [
29
+ Thread.new(subject) {|subject| subject.record_last_value(LogStash::Event.new("my_field" => "2025-01-03T23:59:59.999999999Z")) },
30
+ Thread.new(subject) {|subject| subject.record_last_value(LogStash::Event.new("my_field" => "2025-01-01T23:59:59.999999999Z")) },
31
+ Thread.new(subject) {|subject| subject.record_last_value(LogStash::Event.new("my_field" => "2025-01-02T23:59:59.999999999Z")) },
32
+ ].each(&:join)
33
+ end
34
+ context "when doing intermediate checkpoint" do
35
+ it "persists the smallest value" do
36
+ subject.checkpoint_cursor(intermediate: true)
37
+ expect(IO.read(last_run_metadata_path)).to eq("2025-01-01T23:59:59.999999999Z")
38
+ end
39
+ end
40
+ context "when doing non-intermediate checkpoint" do
41
+ it "persists the largest value" do
42
+ subject.checkpoint_cursor(intermediate: false)
43
+ expect(IO.read(last_run_metadata_path)).to eq("2025-01-03T23:59:59.999999999Z")
44
+ end
45
+ end
46
+ end
47
+
48
+ describe "inject_cursor" do
49
+ let(:new_value) { "2025-01-03T23:59:59.999999999Z" }
50
+ let(:fake_now) { "2026-09-19T23:59:59.999999999Z" }
51
+
52
+ let(:query) do
53
+ %q[
54
+ { "query": { "range": { "event.ingested": { "gt": :last_value, "lt": :present}}}, "sort": [ { "event.ingested": {"order": "asc", "format": "strict_date_optional_time_nanos", "numeric_type" : "date_nanos" } } ] }
55
+ ]
56
+ end
57
+
58
+ before(:each) do
59
+ subject.record_last_value(LogStash::Event.new("my_field" => new_value))
60
+ subject.checkpoint_cursor(intermediate: false)
61
+ allow(subject).to receive(:now_minus_30s).and_return(fake_now)
62
+ end
63
+
64
+ it "injects the value of the cursor into json query if it contains :last_value" do
65
+ expect(subject.inject_cursor(query)).to match(/#{new_value}/)
66
+ end
67
+
68
+ it "injects current time into json query if it contains :present" do
69
+ expect(subject.inject_cursor(query)).to match(/#{fake_now}/)
70
+ end
71
+ end
72
+ end
data/spec/inputs/elasticsearch_spec.rb CHANGED
@@ -21,6 +21,13 @@ describe LogStash::Inputs::Elasticsearch, :ecs_compatibility_support do
21
21
  let(:es_version) { "7.5.0" }
22
22
  let(:cluster_info) { {"version" => {"number" => es_version, "build_flavor" => build_flavor}, "tagline" => "You Know, for Search"} }
23
23
 
24
+ def elastic_ruby_v8_client_available?
25
+ Elasticsearch::Transport
26
+ false
27
+ rescue NameError # NameError: uninitialized constant Elasticsearch::Transport if Elastic Ruby client is not available
28
+ true
29
+ end
30
+
24
31
  before(:each) do
25
32
  Elasticsearch::Client.send(:define_method, :ping) { } # define no-action ping method
26
33
  allow_any_instance_of(Elasticsearch::Client).to receive(:info).and_return(cluster_info)
@@ -92,9 +99,11 @@ describe LogStash::Inputs::Elasticsearch, :ecs_compatibility_support do
92
99
 
93
100
  before do
94
101
  allow(Elasticsearch::Client).to receive(:new).and_return(es_client)
95
- allow(es_client).to receive(:info).and_raise(
96
- Elasticsearch::Transport::Transport::Errors::BadRequest.new
97
- )
102
+ if elastic_ruby_v8_client_available?
103
+ allow(es_client).to receive(:info).and_raise(Elastic::Transport::Transport::Errors::BadRequest.new)
104
+ else
105
+ allow(es_client).to receive(:info).and_raise(Elasticsearch::Transport::Transport::Errors::BadRequest.new)
106
+ end
98
107
  end
99
108
 
100
109
  it "raises an exception" do
@@ -666,11 +675,28 @@ describe LogStash::Inputs::Elasticsearch, :ecs_compatibility_support do
666
675
  context 'if the `docinfo_target` exist but is not of type hash' do
667
676
  let(:config) { base_config.merge 'docinfo' => true, "docinfo_target" => 'metadata_with_string' }
668
677
  let(:do_register) { false }
678
+ let(:mock_queue) { double('Queue', :<< => nil) }
679
+ let(:hit) { response.dig('hits', 'hits').first }
680
+
681
+ it 'emits a tagged event with JSON-serialized event in [event][original]' do
682
+ allow(plugin).to receive(:logger).and_return(double('Logger').as_null_object)
669
683
 
670
- it 'raises an exception if the `docinfo_target` exist but is not of type hash' do
671
- expect(client).not_to receive(:clear_scroll)
672
684
  plugin.register
673
- expect { plugin.run([]) }.to raise_error(Exception, /incompatible event/)
685
+ plugin.run(mock_queue)
686
+
687
+ expect(mock_queue).to have_received(:<<) do |event|
688
+ expect(event).to be_a_kind_of LogStash::Event
689
+
690
+ expect(event.get('tags')).to include("_elasticsearch_input_failure")
691
+ expect(event.get('[event][original]')).to be_a_kind_of String
692
+ expect(JSON.load(event.get('[event][original]'))).to eq hit
693
+ end
694
+
695
+ expect(plugin.logger)
696
+ .to have_received(:warn).with(
697
+ a_string_including("Event creation error, original data now in [event][original] field"),
698
+ a_hash_including(:message => a_string_including('unable to merge docinfo fields into docinfo_target=`metadata_with_string`'),
699
+ :data => a_string_including('"_id":"C5b2xLQwTZa76jBmHIbwHQ"')))
674
700
  end
675
701
 
676
702
  end
@@ -727,8 +753,13 @@ describe LogStash::Inputs::Elasticsearch, :ecs_compatibility_support do
727
753
  it "should set host(s)" do
728
754
  plugin.register
729
755
  client = plugin.send(:client)
730
-
731
- expect( client.transport.instance_variable_get(:@seeds) ).to eql [{
756
+ target_field = :@seeds
757
+ begin
758
+ Elasticsearch::Transport::Client
759
+ rescue
760
+ target_field = :@hosts
761
+ end
762
+ expect( client.transport.instance_variable_get(target_field) ).to eql [{
732
763
  :scheme => "https",
733
764
  :host => "ac31ebb90241773157043c34fd26fd46.us-central1.gcp.cloud.es.io",
734
765
  :port => 9243,
@@ -1134,7 +1165,7 @@ describe LogStash::Inputs::Elasticsearch, :ecs_compatibility_support do
1134
1165
 
1135
1166
  context "when there's an exception" do
1136
1167
  before(:each) do
1137
- allow(client).to receive(:search).and_raise RuntimeError
1168
+ allow(client).to receive(:search).and_raise RuntimeError.new("test exception")
1138
1169
  end
1139
1170
  it 'produces no events' do
1140
1171
  plugin.run queue
@@ -1248,6 +1279,92 @@ describe LogStash::Inputs::Elasticsearch, :ecs_compatibility_support do
1248
1279
  end
1249
1280
  end
1250
1281
 
1282
+ context '#push_hit' do
1283
+ let(:config) do
1284
+ {
1285
+ 'docinfo' => true, # include ids
1286
+ 'docinfo_target' => '[@metadata][docinfo]'
1287
+ }
1288
+ end
1289
+
1290
+ let(:hit) do
1291
+ JSON.load(<<~EOJSON)
1292
+ {
1293
+ "_index" : "test_bulk_index_2",
1294
+ "_type" : "_doc",
1295
+ "_id" : "sHe6A3wBesqF7ydicQvG",
1296
+ "_score" : 1.0,
1297
+ "_source" : {
1298
+ "@timestamp" : "2021-09-20T15:02:02.557Z",
1299
+ "message" : "ping",
1300
+ "@version" : "17",
1301
+ "sequence" : 7,
1302
+ "host" : {
1303
+ "name" : "maybe.local",
1304
+ "ip" : "127.0.0.1"
1305
+ }
1306
+ }
1307
+ }
1308
+ EOJSON
1309
+ end
1310
+
1311
+ let(:mock_queue) { double('queue', :<< => nil) }
1312
+
1313
+ before(:each) do
1314
+ plugin.send(:setup_cursor_tracker)
1315
+ end
1316
+
1317
+ it 'pushes a generated event to the queue' do
1318
+ plugin.send(:push_hit, hit, mock_queue)
1319
+ expect(mock_queue).to have_received(:<<) do |event|
1320
+ expect(event).to be_a_kind_of LogStash::Event
1321
+
1322
+ # fields overriding defaults
1323
+ expect(event.timestamp.to_s).to eq("2021-09-20T15:02:02.557Z")
1324
+ expect(event.get('@version')).to eq("17")
1325
+
1326
+ # structure from hit's _source
1327
+ expect(event.get('message')).to eq("ping")
1328
+ expect(event.get('sequence')).to eq(7)
1329
+ expect(event.get('[host][name]')).to eq("maybe.local")
1330
+ expect(event.get('[host][ip]')).to eq("127.0.0.1")
1331
+
1332
+ # docinfo fields
1333
+ expect(event.get('[@metadata][docinfo][_index]')).to eq("test_bulk_index_2")
1334
+ expect(event.get('[@metadata][docinfo][_type]')).to eq("_doc")
1335
+ expect(event.get('[@metadata][docinfo][_id]')).to eq("sHe6A3wBesqF7ydicQvG")
1336
+ end
1337
+ end
1338
+
1339
+ context 'when event creation fails' do
1340
+ before(:each) do
1341
+ allow(plugin).to receive(:logger).and_return(double('Logger').as_null_object)
1342
+
1343
+ allow(plugin.event_factory).to receive(:new_event).and_call_original
1344
+ allow(plugin.event_factory).to receive(:new_event).with(a_hash_including hit['_source']).and_raise(RuntimeError, 'intentional')
1345
+ end
1346
+
1347
+ it 'pushes a tagged event containing a JSON-encoded hit in [event][original]' do
1348
+ plugin.send(:push_hit, hit, mock_queue)
1349
+
1350
+ expect(mock_queue).to have_received(:<<) do |event|
1351
+ expect(event).to be_a_kind_of LogStash::Event
1352
+
1353
+ expect(event.get('tags')).to include("_elasticsearch_input_failure")
1354
+ expect(event.get('[event][original]')).to be_a_kind_of String
1355
+ expect(JSON.load(event.get('[event][original]'))).to eq hit
1356
+ end
1357
+
1358
+ expect(plugin.logger)
1359
+ .to have_received(:warn).with(
1360
+ a_string_including("Event creation error, original data now in [event][original] field"),
1361
+ a_hash_including(:message => a_string_including('intentional'),
1362
+ :data => a_string_including('"_id":"sHe6A3wBesqF7ydicQvG"')))
1363
+
1364
+ end
1365
+ end
1366
+ end
1367
+
1251
1368
  # @note can be removed once we depends on elasticsearch gem >= 6.x
1252
1369
  def extract_transport(client) # on 7.x client.transport is a ES::Transport::Client
1253
1370
  client.transport.respond_to?(:transport) ? client.transport.transport : client.transport
data/spec/inputs/integration/elasticsearch_spec.rb CHANGED
@@ -4,7 +4,7 @@ require "logstash/plugin"
4
4
  require "logstash/inputs/elasticsearch"
5
5
  require_relative "../../../spec/es_helper"
6
6
 
7
- describe LogStash::Inputs::Elasticsearch, :integration => true do
7
+ describe LogStash::Inputs::Elasticsearch do
8
8
 
9
9
  SECURE_INTEGRATION = ENV['SECURE_INTEGRATION'].eql? 'true'
10
10
 
@@ -76,6 +76,14 @@ describe LogStash::Inputs::Elasticsearch, :integration => true do
76
76
  shared_examples 'secured_elasticsearch' do
77
77
  it_behaves_like 'an elasticsearch index plugin'
78
78
 
79
+ let(:unauth_exception_class) do
80
+ begin
81
+ Elasticsearch::Transport::Transport::Errors::Unauthorized
82
+ rescue
83
+ Elastic::Transport::Transport::Errors::Unauthorized
84
+ end
85
+ end
86
+
79
87
  context "incorrect auth credentials" do
80
88
 
81
89
  let(:config) do
@@ -85,7 +93,7 @@ describe LogStash::Inputs::Elasticsearch, :integration => true do
85
93
  let(:queue) { [] }
86
94
 
87
95
  it "fails to run the plugin" do
88
- expect { plugin.register }.to raise_error Elasticsearch::Transport::Transport::Errors::Unauthorized
96
+ expect { plugin.register }.to raise_error unauth_exception_class
89
97
  end
90
98
  end
91
99
  end
metadata CHANGED
@@ -1,14 +1,14 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: logstash-input-elasticsearch
3
3
  version: !ruby/object:Gem::Version
4
- version: 5.0.0
4
+ version: 5.1.0
5
5
  platform: ruby
6
6
  authors:
7
7
  - Elastic
8
8
  autorequire:
9
9
  bindir: bin
10
10
  cert_chain: []
11
- date: 2024-12-18 00:00:00.000000000 Z
11
+ date: 2025-04-07 00:00:00.000000000 Z
12
12
  dependencies:
13
13
  - !ruby/object:Gem::Dependency
14
14
  requirement: !ruby/object:Gem::Requirement
@@ -92,6 +92,9 @@ dependencies:
92
92
  - - ">="
93
93
  - !ruby/object:Gem::Version
94
94
  version: 7.17.9
95
+ - - "<"
96
+ - !ruby/object:Gem::Version
97
+ version: '9'
95
98
  name: elasticsearch
96
99
  type: :runtime
97
100
  prerelease: false
@@ -100,6 +103,9 @@ dependencies:
100
103
  - - ">="
101
104
  - !ruby/object:Gem::Version
102
105
  version: 7.17.9
106
+ - - "<"
107
+ - !ruby/object:Gem::Version
108
+ version: '9'
103
109
  - !ruby/object:Gem::Dependency
104
110
  requirement: !ruby/object:Gem::Requirement
105
111
  requirements:
@@ -272,21 +278,26 @@ files:
272
278
  - lib/logstash/helpers/loggable_try.rb
273
279
  - lib/logstash/inputs/elasticsearch.rb
274
280
  - lib/logstash/inputs/elasticsearch/aggregation.rb
281
+ - lib/logstash/inputs/elasticsearch/cursor_tracker.rb
275
282
  - lib/logstash/inputs/elasticsearch/paginated_search.rb
276
283
  - lib/logstash/inputs/elasticsearch/patches/_elasticsearch_transport_connections_selector.rb
277
284
  - lib/logstash/inputs/elasticsearch/patches/_elasticsearch_transport_http_manticore.rb
278
285
  - logstash-input-elasticsearch.gemspec
279
286
  - spec/es_helper.rb
287
+ - spec/fixtures/test_certs/GENERATED_AT
280
288
  - spec/fixtures/test_certs/ca.crt
281
289
  - spec/fixtures/test_certs/ca.der.sha256
282
290
  - spec/fixtures/test_certs/ca.key
291
+ - spec/fixtures/test_certs/es.chain.crt
283
292
  - spec/fixtures/test_certs/es.crt
284
293
  - spec/fixtures/test_certs/es.key
294
+ - spec/fixtures/test_certs/renew.sh
295
+ - spec/inputs/cursor_tracker_spec.rb
285
296
  - spec/inputs/elasticsearch_spec.rb
286
297
  - spec/inputs/elasticsearch_ssl_spec.rb
287
298
  - spec/inputs/integration/elasticsearch_spec.rb
288
299
  - spec/inputs/paginated_search_spec.rb
289
- homepage: http://www.elastic.co/guide/en/logstash/current/index.html
300
+ homepage: https://elastic.co/logstash
290
301
  licenses:
291
302
  - Apache License (2.0)
292
303
  metadata:
@@ -313,11 +324,15 @@ specification_version: 4
313
324
  summary: Reads query results from an Elasticsearch cluster
314
325
  test_files:
315
326
  - spec/es_helper.rb
327
+ - spec/fixtures/test_certs/GENERATED_AT
316
328
  - spec/fixtures/test_certs/ca.crt
317
329
  - spec/fixtures/test_certs/ca.der.sha256
318
330
  - spec/fixtures/test_certs/ca.key
331
+ - spec/fixtures/test_certs/es.chain.crt
319
332
  - spec/fixtures/test_certs/es.crt
320
333
  - spec/fixtures/test_certs/es.key
334
+ - spec/fixtures/test_certs/renew.sh
335
+ - spec/inputs/cursor_tracker_spec.rb
321
336
  - spec/inputs/elasticsearch_spec.rb
322
337
  - spec/inputs/elasticsearch_ssl_spec.rb
323
338
  - spec/inputs/integration/elasticsearch_spec.rb