logstash-input-elasticsearch 5.0.0 → 5.1.0
This diff shows the content of the publicly released package versions as they appear in their respective public registries, and is provided for informational purposes only.
- checksums.yaml +4 -4
- data/CHANGELOG.md +11 -0
- data/docs/index.asciidoc +188 -4
- data/lib/logstash/inputs/elasticsearch/aggregation.rb +11 -8
- data/lib/logstash/inputs/elasticsearch/cursor_tracker.rb +58 -0
- data/lib/logstash/inputs/elasticsearch/paginated_search.rb +12 -2
- data/lib/logstash/inputs/elasticsearch.rb +92 -12
- data/logstash-input-elasticsearch.gemspec +3 -3
- data/spec/fixtures/test_certs/GENERATED_AT +1 -0
- data/spec/fixtures/test_certs/ca.crt +17 -18
- data/spec/fixtures/test_certs/ca.der.sha256 +1 -1
- data/spec/fixtures/test_certs/es.chain.crt +38 -0
- data/spec/fixtures/test_certs/es.crt +17 -18
- data/spec/fixtures/test_certs/renew.sh +15 -0
- data/spec/inputs/cursor_tracker_spec.rb +72 -0
- data/spec/inputs/elasticsearch_spec.rb +126 -9
- data/spec/inputs/integration/elasticsearch_spec.rb +10 -2
- metadata +18 -3
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz: …
-  data.tar.gz: …
+  metadata.gz: dc85b0081373116cbedc717e9da3e383c8ec17288ae6fbd57cb0ed3878d5e954
+  data.tar.gz: 33feb6083ba4c7ce074517f366f2ad079d40ab25238841559759fcadae9f8e04
 SHA512:
-  metadata.gz: …
-  data.tar.gz: …
+  metadata.gz: acde0d0c551d2f91f8dea194499dedec6e3285ea4149a0a15111484e1a95d13e97a38fdc97cbe36d57b554aa7092e4fdc6e3214cf901f44315a6855356a25c67
+  data.tar.gz: 18d066e72ff514e0c2ba0777a6f5f755424b2873015b0f0417100dd18124d0caaf4ef7e8ca72edc89548c159628358620ff75a2f90be1673ae00516c69490caa
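The published digests above let you verify a downloaded gem before installing it. A minimal Ruby sketch using only the standard library; the local filename is an assumption for illustration, and the expected value is the SHA256 of data.tar.gz from the table above:

    require "digest"
    require "rubygems/package"

    gem_path = "logstash-input-elasticsearch-5.1.0.gem"  # hypothetical local download
    expected = "33feb6083ba4c7ce074517f366f2ad079d40ab25238841559759fcadae9f8e04"

    actual = nil
    File.open(gem_path, "rb") do |io|
      # a .gem file is a plain tar containing metadata.gz, data.tar.gz and checksums.yaml.gz
      Gem::Package::TarReader.new(io).each do |entry|
        actual = Digest::SHA256.hexdigest(entry.read) if entry.full_name == "data.tar.gz"
      end
    end

    abort "checksum mismatch" unless actual == expected
    puts "data.tar.gz checksum OK"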
data/CHANGELOG.md CHANGED
@@ -1,3 +1,13 @@
+## 5.1.0
+- Add "cursor"-like index tracking [#205](https://github.com/logstash-plugins/logstash-input-elasticsearch/pull/205)
+
+## 5.0.2
+- Add elastic-transport client support used in elasticsearch-ruby 8.x [#223](https://github.com/logstash-plugins/logstash-input-elasticsearch/pull/223)
+
+## 5.0.1
+- Fix: prevent plugin crash when hits contain illegal structure [#218](https://github.com/logstash-plugins/logstash-input-elasticsearch/pull/218)
+- When a hit cannot be converted to an event, the input now emits an event tagged with `_elasticsearch_input_failure` with an `[event][original]` containing a JSON-encoded string representation of the entire hit.
+
 ## 5.0.0
 - SSL settings that were marked deprecated in version `4.17.0` are now marked obsolete, and will prevent the plugin from starting.
 - These settings are:
@@ -5,6 +15,7 @@
   - `ca_file`, which should be replaced by `ssl_certificate_authorities`
   - `ssl_certificate_verification`, which should be replaced by `ssl_verification_mode`
 - [#213](https://github.com/logstash-plugins/logstash-input-elasticsearch/pull/213)
+- Add support for custom headers [#207](https://github.com/logstash-plugins/logstash-input-elasticsearch/pull/207)
 
 ## 4.20.5
 - Add `x-elastic-product-origin` header to Elasticsearch requests [#211](https://github.com/logstash-plugins/logstash-input-elasticsearch/pull/211)
data/docs/index.asciidoc CHANGED
@@ -48,7 +48,7 @@ This would create an Elasticsearch query with the following format:
     "sort": [ "_doc" ]
 }'
 
-…
+[id="plugins-{type}s-{plugin}-scheduling"]
 ==== Scheduling
 
 Input from this plugin can be scheduled to run periodically according to a specific
@@ -93,6 +93,143 @@ The plugin logs a warning when ECS is enabled and `target` isn't set.
 
 TIP: Set the `target` option to avoid potential schema conflicts.
 
+[id="plugins-{type}s-{plugin}-failure-handling"]
+==== Failure handling
+
+When this input plugin cannot create a structured `Event` from a hit result, it will instead create an `Event` that is tagged with `_elasticsearch_input_failure` whose `[event][original]` is a JSON-encoded string representation of the entire hit.
+
+Common causes are:
+
+- When the hit result contains top-level fields that are {logstash-ref}/processing.html#reserved-fields[reserved in Logstash] but do not have the expected shape. Use the <<plugins-{type}s-{plugin}-target>> directive to avoid conflicts with the top-level namespace.
+- When <<plugins-{type}s-{plugin}-docinfo>> is enabled and the docinfo fields cannot be merged into the hit result. Combine <<plugins-{type}s-{plugin}-target>> and <<plugins-{type}s-{plugin}-docinfo_target>> to avoid conflict.
+
+[id="plugins-{type}s-{plugin}-cursor"]
+==== Tracking a field's value across runs
+
+.Technical Preview: Tracking a field's value
+****
+The feature that allows tracking a field's value across runs is in _Technical Preview_.
+Configuration options and implementation details are subject to change in minor releases without being preceded by deprecation warnings.
+****
+
+Some use cases require tracking the value of a particular field between two jobs.
+Examples include:
+
+* avoiding the need to re-process the entire result set of a long query after an unplanned restart
+* grabbing only new data from an index instead of processing the entire set on each job.
+
+The Elasticsearch input plugin provides the <<plugins-{type}s-{plugin}-tracking_field>> and <<plugins-{type}s-{plugin}-tracking_field_seed>> options.
+When <<plugins-{type}s-{plugin}-tracking_field>> is set, the plugin records the value of that field for the last document retrieved in a run into
+a file.
+(The file location defaults to <<plugins-{type}s-{plugin}-last_run_metadata_path>>.)
+
+You can then inject this value in the query using the placeholder `:last_value`.
+The value will be injected into the query before execution, and then updated after the query completes if new data was found.
+
+This feature works best when:
+
+* the query sorts by the tracking field,
+* the timestamp field is added by {es}, and
+* the field type has enough resolution so that two events are unlikely to have the same value.
+
+Consider using a tracking field whose type is https://www.elastic.co/guide/en/elasticsearch/reference/current/date_nanos.html[date nanoseconds].
+If the tracking field is of this data type, you can use an extra placeholder called `:present` to inject the nano-second based value of "now-30s".
+This placeholder is useful as the right-hand side of a range filter, allowing the collection of
+new data but leaving partially-searchable bulk request data to the next scheduled job.
+
+[id="plugins-{type}s-{plugin}-tracking-sample"]
+===== Sample configuration: Track field value across runs
+
+This section contains a series of steps to help you set up the "tailing" of data being written to a set of indices, using a date nanosecond field added by an Elasticsearch ingest pipeline and the `tracking_field` capability of this plugin.
+
+. Create ingest pipeline that adds Elasticsearch's `_ingest.timestamp` field to the documents as `event.ingested`:
++
+[source, json]
+PUT _ingest/pipeline/my-pipeline
+{
+  "processors": [
+    {
+      "script": {
+        "lang": "painless",
+        "source": "ctx.putIfAbsent(\"event\", [:]); ctx.event.ingested = metadata().now.format(DateTimeFormatter.ISO_INSTANT);"
+      }
+    }
+  ]
+}
+
+[start=2]
+. Create an index mapping where the tracking field is of date nanosecond type and invokes the defined pipeline:
++
+[source, json]
+PUT /_template/my_template
+{
+  "index_patterns": ["test-*"],
+  "settings": {
+    "index.default_pipeline": "my-pipeline",
+  },
+  "mappings": {
+    "properties": {
+      "event": {
+        "properties": {
+          "ingested": {
+            "type": "date_nanos",
+            "format": "strict_date_optional_time_nanos"
+          }
+        }
+      }
+    }
+  }
+}
+
+[start=3]
+. Define a query that looks at all data of the indices, sorted by the tracking field, and with a range filter since the last value seen until present:
++
+[source,json]
+{
+  "query": {
+    "range": {
+      "event.ingested": {
+        "gt": ":last_value",
+        "lt": ":present"
+      }
+    }
+  },
+  "sort": [
+    {
+      "event.ingested": {
+        "order": "asc",
+        "format": "strict_date_optional_time_nanos",
+        "numeric_type": "date_nanos"
+      }
+    }
+  ]
+}
+
+[start=4]
+. Configure the Elasticsearch input to query the indices with the query defined above, every minute, and track the `event.ingested` field:
++
+[source, ruby]
+input {
+  elasticsearch {
+    id => tail_test_index
+    hosts => [ 'https://..']
+    api_key => '....'
+    index => 'test-*'
+    query => '{ "query": { "range": { "event.ingested": { "gt": ":last_value", "lt": ":present"}}}, "sort": [ { "event.ingested": {"order": "asc", "format": "strict_date_optional_time_nanos", "numeric_type" : "date_nanos" } } ] }'
+    tracking_field => "[event][ingested]"
+    slices => 5 # optional use of slices to speed data processing, should be equal to or less than number of primary shards
+    schedule => '* * * * *' # every minute
+    schedule_overlap => false # don't accumulate jobs if one takes longer than 1 minute
+  }
+}
+
+With this sample setup, new documents are indexed into a `test-*` index.
+The next scheduled run:
+
+* selects all new documents since the last observed value of the tracking field,
+* uses {ref}/point-in-time-api.html#point-in-time-api[Point in time (PIT)] + {ref}/paginate-search-results.html#search-after[Search after] to paginate through all the data, and
+* updates the value of the field at the end of the pagination.
+
 [id="plugins-{type}s-{plugin}-options"]
 ==== Elasticsearch Input configuration options
 
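The failure-handling behavior documented above lends itself to tag-based routing downstream. A minimal sketch of a pipeline output section (the file path and output choices are illustrative, not from the docs):

    output {
      if "_elasticsearch_input_failure" in [tags] {
        # [event][original] holds the offending hit as a JSON string
        file { path => "/var/log/logstash/es_input_failures.ndjson" }
      } else {
        stdout { codec => dots }
      }
    }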
@@ -101,9 +238,6 @@ This plugin supports these configuration options plus the <<plugins-{type}s-{plu
 NOTE: As of version `5.0.0` of this plugin, a number of previously deprecated settings related to SSL have been removed.
 Please check out <<plugins-{type}s-{plugin}-obsolete-options>> for details.
 
-NOTE: As of version `5.0.0` of this plugin, a number of previously deprecated settings related to SSL have been removed.
-Please check out <<plugins-{type}s-{plugin}-obsolete-options>> for details.
-
 [cols="<,<,<",options="header",]
 |=======================================================================
 |Setting |Input type|Required
@@ -119,12 +253,14 @@ Please check out <<plugins-{type}s-{plugin}-obsolete-options>> for details.
 | <<plugins-{type}s-{plugin}-ecs_compatibility>> |<<string,string>>|No
 | <<plugins-{type}s-{plugin}-hosts>> |<<array,array>>|No
 | <<plugins-{type}s-{plugin}-index>> |<<string,string>>|No
+| <<plugins-{type}s-{plugin}-last_run_metadata_path>> |<<string,string>>|No
 | <<plugins-{type}s-{plugin}-password>> |<<password,password>>|No
 | <<plugins-{type}s-{plugin}-proxy>> |<<uri,uri>>|No
 | <<plugins-{type}s-{plugin}-query>> |<<string,string>>|No
 | <<plugins-{type}s-{plugin}-response_type>> |<<string,string>>, one of `["hits","aggregations"]`|No
 | <<plugins-{type}s-{plugin}-request_timeout_seconds>> | <<number,number>>|No
 | <<plugins-{type}s-{plugin}-schedule>> |<<string,string>>|No
+| <<plugins-{type}s-{plugin}-schedule_overlap>> |<<boolean,boolean>>|No
 | <<plugins-{type}s-{plugin}-scroll>> |<<string,string>>|No
 | <<plugins-{type}s-{plugin}-search_api>> |<<string,string>>, one of `["auto", "search_after", "scroll"]`|No
 | <<plugins-{type}s-{plugin}-size>> |<<number,number>>|No
@@ -144,6 +280,8 @@ Please check out <<plugins-{type}s-{plugin}-obsolete-options>> for details.
 | <<plugins-{type}s-{plugin}-ssl_verification_mode>> |<<string,string>>, one of `["full", "none"]`|No
 | <<plugins-{type}s-{plugin}-socket_timeout_seconds>> | <<number,number>>|No
 | <<plugins-{type}s-{plugin}-target>> | {logstash-ref}/field-references-deepdive.html[field reference] | No
+| <<plugins-{type}s-{plugin}-tracking_field>> |<<string,string>>|No
+| <<plugins-{type}s-{plugin}-tracking_field_seed>> |<<string,string>>|No
 | <<plugins-{type}s-{plugin}-retries>> | <<number,number>>|No
 | <<plugins-{type}s-{plugin}-user>> |<<string,string>>|No
 |=======================================================================
@@ -323,6 +461,17 @@ Check out {ref}/api-conventions.html#api-multi-index[Multi Indices
 documentation] in the Elasticsearch documentation for info on
 referencing multiple indices.
 
+[id="plugins-{type}s-{plugin}-last_run_metadata_path"]
+===== `last_run_metadata_path`
+
+* Value type is <<string,string>>
+* There is no default value for this setting.
+
+The path to store the last observed value of the tracking field, when used.
+By default this file is stored as `<path.data>/plugins/inputs/elasticsearch/<pipeline_id>/last_run_value`.
+
+This setting should point to a file, not a directory, and Logstash must have read+write access to this file.
+
 [id="plugins-{type}s-{plugin}-password"]
 ===== `password`
 
@@ -403,6 +552,19 @@ for example: "* * * * *" (execute query every minute, on the minute)
 There is no schedule by default. If no schedule is given, then the statement is run
 exactly once.
 
+[id="plugins-{type}s-{plugin}-schedule_overlap"]
+===== `schedule_overlap`
+
+* Value type is <<boolean,boolean>>
+* Default value is `true`
+
+Whether to allow queuing of a scheduled run if a run is already occurring.
+While this is ideal for ensuring that a new run starts immediately after the previous one finishes when there
+is a lot of work to do, the queue is unbounded and may lead to an out-of-memory error over long periods of time
+if it grows continuously.
+
+When in doubt, set `schedule_overlap` to false (it may become the default value in the future).
+
 [id="plugins-{type}s-{plugin}-scroll"]
 ===== `scroll`
 
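A minimal sketch showing the new setting alongside `schedule` (host and query are placeholders):

    input {
      elasticsearch {
        hosts => ["https://es.example:9200"]
        query => '{ "query": { "match_all": {} } }'
        schedule => "*/5 * * * *"     # run every five minutes
        schedule_overlap => false     # never queue a second run behind a slow one
      }
    }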
@@ -615,6 +777,28 @@ When the `target` is set to a field reference, the `_source` of the hit is place
 This option can be useful to avoid populating unknown fields when a downstream schema such as ECS is enforced.
 It is also possible to target an entry in the event's metadata, which will be available during event processing but not exported to your outputs (e.g., `target \=> "[@metadata][_source]"`).
 
+[id="plugins-{type}s-{plugin}-tracking_field"]
+===== `tracking_field`
+
+* Value type is <<string,string>>
+* There is no default value for this setting.
+
+Which field from the last event of a previous run will be used as a cursor value for the following run.
+The value of this field is injected into each query if the query uses the placeholder `:last_value`.
+For the first query after a pipeline is started, the value used is either read from the <<plugins-{type}s-{plugin}-last_run_metadata_path>> file,
+or taken from the <<plugins-{type}s-{plugin}-tracking_field_seed>> setting.
+
+Note: The tracking value is updated after each page is read and at the end of each Point in Time. In case of a crash the last saved value will be used, so some duplication of data can occur. For this reason the use of unique document IDs for each event is recommended in the downstream destination.
+
+[id="plugins-{type}s-{plugin}-tracking_field_seed"]
+===== `tracking_field_seed`
+
+* Value type is <<string,string>>
+* Default value is `"1970-01-01T00:00:00.000000000Z"`
+
+The starting value for the <<plugins-{type}s-{plugin}-tracking_field>> if there is no <<plugins-{type}s-{plugin}-last_run_metadata_path>> file already.
+This field defaults to the nanosecond-precision ISO8601 representation of `epoch`, or "1970-01-01T00:00:00.000000000Z", given that nano-second precision timestamps are the
+most reliable data format to use for this feature.
 
 [id="plugins-{type}s-{plugin}-user"]
 ===== `user`
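Because a crash replays data from the last saved cursor value, the `tracking_field` note above recommends unique document IDs downstream. One common way to get them, sketched here under the assumption that `docinfo => true` and `docinfo_target => "[@metadata][docinfo]"` are set on the input, is to reuse the source document's `_id` in an elasticsearch output so replayed hits overwrite rather than duplicate:

    output {
      elasticsearch {
        hosts => ["https://destination.example:9200"]    # illustrative destination
        index => "copy-%{[@metadata][docinfo][_index]}"
        document_id => "%{[@metadata][docinfo][_id]}"
      }
    }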
data/lib/logstash/inputs/elasticsearch/aggregation.rb CHANGED
@@ -12,14 +12,9 @@ module LogStash
       @client = client
       @plugin_params = plugin.params
 
+      @index = @plugin_params["index"]
       @size = @plugin_params["size"]
-      @query = @plugin_params["query"]
       @retries = @plugin_params["retries"]
-      @agg_options = {
-        :index => @plugin_params["index"],
-        :size => 0
-      }.merge(:body => @query)
-
       @plugin = plugin
     end
 
@@ -33,10 +28,18 @@ module LogStash
       false
     end
 
-    def do_run(output_queue)
+    def aggregation_options(query_object)
+      {
+        :index => @index,
+        :size => 0,
+        :body => query_object
+      }
+    end
+
+    def do_run(output_queue, query_object)
       logger.info("Aggregation starting")
       r = retryable(AGGREGATION_JOB) do
-        @client.search(@agg_options)
+        @client.search(aggregation_options(query_object))
       end
       @plugin.push_hit(r, output_queue, 'aggregations') if r
     end
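The refactor above turns the search options into a pure function of the per-run query object instead of state captured at construction time, which is what allows a cursor-injected query to flow through on every scheduled run. Illustrative values (not from the diff):

    # with @index = "test-*":
    aggregation_options({ "aggs" => { "total" => { "sum" => { "field" => "bytes" } } } })
    # => { :index => "test-*", :size => 0, :body => { "aggs" => { "total" => ... } } }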
data/lib/logstash/inputs/elasticsearch/cursor_tracker.rb ADDED
@@ -0,0 +1,58 @@
+require 'fileutils'
+
+module LogStash; module Inputs; class Elasticsearch
+  class CursorTracker
+    include LogStash::Util::Loggable
+
+    attr_reader :last_value
+
+    def initialize(last_run_metadata_path:, tracking_field:, tracking_field_seed:)
+      @last_run_metadata_path = last_run_metadata_path
+      @last_value_hashmap = Java::java.util.concurrent.ConcurrentHashMap.new
+      @last_value = IO.read(@last_run_metadata_path) rescue nil || tracking_field_seed
+      @tracking_field = tracking_field
+      logger.info "Starting value for cursor field \"#{@tracking_field}\": #{@last_value}"
+      @mutex = Mutex.new
+    end
+
+    def checkpoint_cursor(intermediate: true)
+      @mutex.synchronize do
+        if intermediate
+          # in intermediate checkpoints pick the smallest
+          converge_last_value {|v1, v2| v1 < v2 ? v1 : v2}
+        else
+          # in the last search of a PIT choose the largest
+          converge_last_value {|v1, v2| v1 > v2 ? v1 : v2}
+          @last_value_hashmap.clear
+        end
+        IO.write(@last_run_metadata_path, @last_value)
+      end
+    end
+
+    def converge_last_value(&block)
+      return if @last_value_hashmap.empty?
+      new_last_value = @last_value_hashmap.reduceValues(1000, &block)
+      logger.debug? && logger.debug("converge_last_value: got #{@last_value_hashmap.values.inspect}. won: #{new_last_value}")
+      return if new_last_value == @last_value
+      @last_value = new_last_value
+      logger.info "New cursor value for field \"#{@tracking_field}\" is: #{new_last_value}"
+    end
+
+    def record_last_value(event)
+      value = event.get(@tracking_field)
+      logger.trace? && logger.trace("storing last_value if #{@tracking_field} for #{Thread.current.object_id}: #{value}")
+      @last_value_hashmap.put(Thread.current.object_id, value)
+    end
+
+    def inject_cursor(query_json)
+      # ":present" means "now - 30s" to avoid grabbing partially visible data in the PIT
+      result = query_json.gsub(":last_value", @last_value.to_s).gsub(":present", now_minus_30s)
+      logger.debug("inject_cursor: injected values for ':last_value' and ':present'", :query => result)
+      result
+    end
+
+    def now_minus_30s
+      Java::java.time.Instant.now.minusSeconds(30).to_s
+    end
+  end
+end; end; end
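A sketch of the lifecycle the plugin drives on this class, mirroring the spec added further down; it assumes a JRuby runtime with logstash-core loaded, since the implementation leans on java.util.concurrent and LogStash::Event:

    require "logstash/inputs/elasticsearch/cursor_tracker"

    tracker = LogStash::Inputs::Elasticsearch::CursorTracker.new(
      last_run_metadata_path: "/tmp/last_run_value",           # illustrative path
      tracking_field: "[event][ingested]",
      tracking_field_seed: "1970-01-01T00:00:00.000000000Z")

    # per hit, on whichever worker thread produced it
    event = LogStash::Event.new("event" => { "ingested" => "2025-01-01T00:00:00.000000000Z" })
    tracker.record_last_value(event)

    tracker.checkpoint_cursor(intermediate: true)   # after each page: keep the smallest value seen
    tracker.checkpoint_cursor(intermediate: false)  # after the PIT ends: keep the largest, then clear

    # before the next run: substitute the placeholders in the query string
    tracker.inject_cursor('{ "range": { "event.ingested": { "gt": ":last_value", "lt": ":present" } } }')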
data/lib/logstash/inputs/elasticsearch/paginated_search.rb CHANGED
@@ -21,9 +21,10 @@ module LogStash
       @pipeline_id = plugin.pipeline_id
     end
 
-    def do_run(output_queue)
-      return retryable_search(output_queue) if @slices.nil? || @slices <= 1
+    def do_run(output_queue, query)
+      @query = query
 
+      return retryable_search(output_queue) if @slices.nil? || @slices <= 1
       retryable_slice_search(output_queue)
     end
 
@@ -122,6 +123,13 @@ module LogStash
       PIT_JOB = "create point in time (PIT)"
       SEARCH_AFTER_JOB = "search_after paginated search"
 
+      attr_accessor :cursor_tracker
+
+      def do_run(output_queue, query)
+        super(output_queue, query)
+        @cursor_tracker.checkpoint_cursor(intermediate: false) if @cursor_tracker
+      end
+
       def pit?(id)
         !!id&.is_a?(String)
       end
@@ -192,6 +200,8 @@ module LogStash
         end
       end
 
+      @cursor_tracker.checkpoint_cursor(intermediate: true) if @cursor_tracker
+
       logger.info("Query completed", log_details)
     end
 
data/lib/logstash/inputs/elasticsearch.rb CHANGED
@@ -13,9 +13,7 @@ require "logstash/plugin_mixins/normalize_config_support"
 require "base64"
 
 require "elasticsearch"
-require "elasticsearch/transport/transport/http/manticore"
-require_relative "elasticsearch/patches/_elasticsearch_transport_http_manticore"
-require_relative "elasticsearch/patches/_elasticsearch_transport_connections_selector"
+require "manticore"
 
 # .Compatibility Note
 # [NOTE]
@@ -75,6 +73,7 @@ class LogStash::Inputs::Elasticsearch < LogStash::Inputs::Base
 
   require 'logstash/inputs/elasticsearch/paginated_search'
   require 'logstash/inputs/elasticsearch/aggregation'
+  require 'logstash/inputs/elasticsearch/cursor_tracker'
 
   include LogStash::PluginMixins::ECSCompatibilitySupport(:disabled, :v1, :v8 => :v1)
   include LogStash::PluginMixins::ECSCompatibilitySupport::TargetCheck
@@ -126,6 +125,20 @@ class LogStash::Inputs::Elasticsearch < LogStash::Inputs::Base
   # by this pipeline input.
   config :slices, :validate => :number
 
+  # Enable tracking the value of a given field to be used as a cursor
+  # Main concerns:
+  #  * using anything other than _event.timestamp easily leads to data loss
+  #  * the first "synchronization" run can take a long time
+  config :tracking_field, :validate => :string
+
+  # Define the initial seed value of the tracking_field
+  config :tracking_field_seed, :validate => :string, :default => "1970-01-01T00:00:00.000000000Z"
+
+  # The location of where the tracking field value will be stored
+  # The value is persisted after each scheduled run (and not per result)
+  # If it's not set it defaults to '${path.data}/plugins/inputs/elasticsearch/<pipeline_id>/last_run_value'
+  config :last_run_metadata_path, :validate => :string
+
   # If set, include Elasticsearch document information such as index, type, and
   # the id in the event.
   #
@@ -252,6 +265,10 @@ class LogStash::Inputs::Elasticsearch < LogStash::Inputs::Base
   # exactly once.
   config :schedule, :validate => :string
 
+  # Allow scheduled runs to overlap (enabled by default). Setting to false will
+  # only start a new scheduled run after the previous one completes.
+  config :schedule_overlap, :validate => :boolean
+
   # If set, the _source of each hit will be added nested under the target instead of at the top-level
   config :target, :validate => :field_reference
 
@@ -316,7 +333,7 @@ class LogStash::Inputs::Elasticsearch < LogStash::Inputs::Base
     @client_options = {
       :hosts => hosts,
       :transport_options => transport_options,
-      :transport_class => ::Elasticsearch::Transport::Transport::HTTP::Manticore,
+      :transport_class => get_transport_client_class,
       :ssl => ssl_options
     }
 
@@ -330,26 +347,55 @@ class LogStash::Inputs::Elasticsearch < LogStash::Inputs::Base
 
     setup_query_executor
 
+    setup_cursor_tracker
+
     @client
   end
 
   def run(output_queue)
     if @schedule
-      scheduler.cron(@schedule) { @query_executor.do_run(output_queue) }
+      scheduler.cron(@schedule, :overlap => @schedule_overlap) do
+        @query_executor.do_run(output_queue, get_query_object())
+      end
       scheduler.join
     else
-      @query_executor.do_run(output_queue)
+      @query_executor.do_run(output_queue, get_query_object())
+    end
+  end
+
+  def get_query_object
+    if @cursor_tracker
+      query = @cursor_tracker.inject_cursor(@query)
+      @logger.debug("new query is #{query}")
+    else
+      query = @query
     end
+    LogStash::Json.load(query)
   end
 
   ##
   # This can be called externally from the query_executor
   public
   def push_hit(hit, output_queue, root_field = '_source')
-    event = targeted_event_factory.new_event hit[root_field]
-    set_docinfo_fields(hit, event) if @docinfo
+    event = event_from_hit(hit, root_field)
     decorate(event)
     output_queue << event
+    record_last_value(event)
+  end
+
+  def record_last_value(event)
+    @cursor_tracker.record_last_value(event) if @tracking_field
+  end
+
+  def event_from_hit(hit, root_field)
+    event = targeted_event_factory.new_event hit[root_field]
+    set_docinfo_fields(hit, event) if @docinfo
+
+    event
+  rescue => e
+    serialized_hit = hit.to_json
+    logger.warn("Event creation error, original data now in [event][original] field", message: e.message, exception: e.class, data: serialized_hit)
+    return event_factory.new_event('event' => { 'original' => serialized_hit }, 'tags' => ['_elasticsearch_input_failure'])
   end
 
   def set_docinfo_fields(hit, event)
@@ -357,10 +403,8 @@ class LogStash::Inputs::Elasticsearch < LogStash::Inputs::Base
     docinfo_target = event.get(@docinfo_target) || {}
 
     unless docinfo_target.is_a?(Hash)
-…
-      # TODO: (colin) I am not sure raising is a good strategy here?
-      raise Exception.new("Elasticsearch input: incompatible event")
+      # expect error to be handled by `#event_from_hit`
+      fail RuntimeError, "Incompatible event; unable to merge docinfo fields into docinfo_target=`#{@docinfo_target}`"
     end
 
     @docinfo_fields.each do |field|
@@ -634,6 +678,42 @@ class LogStash::Inputs::Elasticsearch < LogStash::Inputs::Base
     end
   end
 
+  def setup_cursor_tracker
+    return unless @tracking_field
+    return unless @query_executor.is_a?(LogStash::Inputs::Elasticsearch::SearchAfter)
+
+    if @resolved_search_api != "search_after" || @response_type != "hits"
+      raise ConfigurationError.new("The `tracking_field` feature can only be used with `search_after` non-aggregation queries")
+    end
+
+    @cursor_tracker = CursorTracker.new(last_run_metadata_path: last_run_metadata_path,
+                                        tracking_field: @tracking_field,
+                                        tracking_field_seed: @tracking_field_seed)
+    @query_executor.cursor_tracker = @cursor_tracker
+  end
+
+  def last_run_metadata_path
+    return @last_run_metadata_path if @last_run_metadata_path
+
+    last_run_metadata_path = ::File.join(LogStash::SETTINGS.get_value("path.data"), "plugins", "inputs", "elasticsearch", pipeline_id, "last_run_value")
+    FileUtils.mkdir_p ::File.dirname(last_run_metadata_path)
+    last_run_metadata_path
+  end
+
+  def get_transport_client_class
+    # LS-core includes `elasticsearch` gem. The gem is composed of two separate gems: `elasticsearch-api` and `elasticsearch-transport`
+    # And now `elasticsearch-transport` is old, instead we have `elastic-transport`.
+    # LS-core updated `elasticsearch` > 8: https://github.com/elastic/logstash/pull/17161
+    # Following source bits are for the compatibility to support both `elasticsearch-transport` and `elastic-transport` gems
+    require "elasticsearch/transport/transport/http/manticore"
+    require_relative "elasticsearch/patches/_elasticsearch_transport_http_manticore"
+    require_relative "elasticsearch/patches/_elasticsearch_transport_connections_selector"
+    ::Elasticsearch::Transport::Transport::HTTP::Manticore
+  rescue ::LoadError
+    require "elastic/transport/transport/http/manticore"
+    ::Elastic::Transport::Transport::HTTP::Manticore
+  end
+
   module URIOrEmptyValidator
     ##
     # @override to provide :uri_or_empty validator
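The load-or-fallback idiom in `get_transport_client_class` above can be vendored in isolation; a sketch (the require paths and constants match the two published transport gems):

    begin
      require "elasticsearch/transport"   # elasticsearch-ruby 7.x ships `elasticsearch-transport`
      transport = Elasticsearch::Transport
    rescue LoadError
      require "elastic/transport"         # elasticsearch-ruby 8.x uses the standalone `elastic-transport`
      transport = Elastic::Transport
    end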
data/logstash-input-elasticsearch.gemspec CHANGED
@@ -1,13 +1,13 @@
 Gem::Specification.new do |s|
 
   s.name = 'logstash-input-elasticsearch'
-  s.version = '5.0.0'
+  s.version = '5.1.0'
   s.licenses = ['Apache License (2.0)']
   s.summary = "Reads query results from an Elasticsearch cluster"
   s.description = "This gem is a Logstash plugin required to be installed on top of the Logstash core pipeline using $LS_HOME/bin/logstash-plugin install gemname. This gem is not a stand-alone program"
   s.authors = ["Elastic"]
   s.email = 'info@elastic.co'
-  s.homepage = "…"
+  s.homepage = "https://elastic.co/logstash"
   s.require_paths = ["lib"]
 
   # Files
@@ -26,7 +26,7 @@ Gem::Specification.new do |s|
   s.add_runtime_dependency "logstash-mixin-validator_support", '~> 1.0'
   s.add_runtime_dependency "logstash-mixin-scheduler", '~> 1.0'
 
-  s.add_runtime_dependency 'elasticsearch', '>= 7.17.9'
+  s.add_runtime_dependency 'elasticsearch', '>= 7.17.9', '< 9'
   s.add_runtime_dependency 'logstash-mixin-ca_trusted_fingerprint_support', '~> 1.0'
   s.add_runtime_dependency 'logstash-mixin-normalize_config_support', '~>1.0'
 
data/spec/fixtures/test_certs/GENERATED_AT ADDED
@@ -0,0 +1 @@
+2024-12-26T22:27:15+00:00
data/spec/fixtures/test_certs/ca.crt CHANGED
@@ -1,20 +1,19 @@
 -----BEGIN CERTIFICATE-----
-…
-emHprBii/5y1HieKXlX9CZRb5qEPHckDVXW3znw=
+MIIDFTCCAf2gAwIBAgIBATANBgkqhkiG9w0BAQsFADA0MTIwMAYDVQQDEylFbGFz
+dGljIENlcnRpZmljYXRlIFRvb2wgQXV0b2dlbmVyYXRlZCBDQTAeFw0yNDEyMjYy
+MjI3MTVaFw0yNTEyMjYyMjI3MTVaMDQxMjAwBgNVBAMTKUVsYXN0aWMgQ2VydGlm
+aWNhdGUgVG9vbCBBdXRvZ2VuZXJhdGVkIENBMIIBIjANBgkqhkiG9w0BAQEFAAOC
+AQ8AMIIBCgKCAQEArUe66xG4Y2zO13gRC+rBwyvxe+c01pqV6ukw6isIbJIQWs1/
+QfEMhUwYwKs6/UXxK+VwardcA2zYwngXbGGEtms+mpUfH5CdJnrqW7lHz1BVK4yH
+90IzGE0GU4D90OW/L4QkGX0fv3VQbL8KGFKBoF04pXIaSGMStFN4wirutHtQboYv
+99X4kbLjVSIuubUpA/v9dUP1TNl8ar+HKUWRM96ijHkFTF3FR0NnZyt44gP5qC0h
+i4lUiR6Uo9D6WMFjeRYFF7GolCy/I1SzWBmmOnNhQLO5VxcNG4ldhBcapZeGwE98
+m/5lxLIwgFR9ZP8bXdxZTWLC58/LQ2NqOjA9mwIDAQABozIwMDAPBgNVHRMBAf8E
+BTADAQH/MB0GA1UdDgQWBBTIJMnuftpfkxNCOkbF0R4xgcKQRjANBgkqhkiG9w0B
+AQsFAAOCAQEAhfg/cmXc4Uh90yiXU8jOW8saQjTsq4ZMDQiLfJsNmNNYmHFN0vhv
+lJRI1STdy7+GpjS5QbrMjQIxWSS8X8xysE4Rt81IrWmLuao35TRFyoiE1seBQ5sz
+p/BxZUe57JvWi9dyzv2df4UfWFdGBhzdr80odZmz4i5VIv6qCKJKsGikcuLpepmp
+E/UKnKHeR/dFWsxzA9P2OzHTUNBMOOA2PyAUL49pwoChwJeOWN/zAgwMWLbuHFG0
+IN0u8swAmeH98QdvzbhiOatGNpqfTNvQEDc19yVjfXKpBVZQ79WtronYSqrbrUa1
+T2zD8bIVP7CdddD/UmpT1SSKh4PJxudy5Q==
 -----END CERTIFICATE-----
data/spec/fixtures/test_certs/ca.der.sha256 CHANGED
@@ -1 +1 @@
-…
+b1e955819b0d14f64f863adb103c248ddacf2e17bea48d04ee4b57c64814ccc4
data/spec/fixtures/test_certs/es.chain.crt ADDED
@@ -0,0 +1,38 @@
+-----BEGIN CERTIFICATE-----
+MIIDIzCCAgugAwIBAgIBATANBgkqhkiG9w0BAQsFADA0MTIwMAYDVQQDEylFbGFz
+dGljIENlcnRpZmljYXRlIFRvb2wgQXV0b2dlbmVyYXRlZCBDQTAeFw0yNDEyMjYy
+MjI3MTVaFw0yNTEyMjYyMjI3MTVaMA0xCzAJBgNVBAMTAmVzMIIBIjANBgkqhkiG
+9w0BAQEFAAOCAQ8AMIIBCgKCAQEArZLZvLSWDK7Ul+AaBnjU81dsfaow8zOjCC5V
+V21nXpYzQJoQbuWcvGYxwL7ZDs2ca4Wc8BVCj1NDduHuP7U+QIlUdQpl8kh5a0Zz
+36pcFw7UyF51/AzWixJrht/Azzkb5cpZtE22ZK0KhS4oCsjJmTN0EABAsGhDI9/c
+MjNrUC7iP0dvfOuzAPp7ufY83h98jKKXUYV24snbbvmqoWI6GQQNSG/sEo1+1UGH
+/z07/mVKoBAa5DVoNGvxN0fCE7vW7hkhT8+frJcsYFatAbnf6ql0KzEa8lN9u0gR
+hQNM3zcKKsjEMomBzVBc4SV3KXO0d/jGdDtlqsm2oXqlTMdtGwIDAQABo2cwZTAY
+BgNVHREEETAPgg1lbGFzdGljc2VhcmNoMAkGA1UdEwQCMAAwHQYDVR0OBBYEFFQU
+K+6Cg2kExRj1xSDzEi4kkgKXMB8GA1UdIwQYMBaAFMgkye5+2l+TE0I6RsXRHjGB
+wpBGMA0GCSqGSIb3DQEBCwUAA4IBAQB6cZ7IrDzcAoOZgAt9RlOe2yzQeH+alttp
+CSQVINjJotS1WvmtqjBB6ArqLpXIGU89TZsktNe/NQJzgYSaMnlIuHVLFdxJYmwU
+T1cP6VC/brmqP/dd5y7VWE7Lp+Wd5CxKl/WY+9chmgc+a1fW/lnPEJJ6pca1Bo8b
+byIL0yY2IUv4R2eh1IyQl9oGH1GOPLgO7cY04eajxYcOVA2eDSItoyDtrJfkFP/P
+UXtC1JAkvWKuujFEiBj0AannhroWlp3gvChhBwCuCAU0KXD6g8BE8tn6oT1+FW7J
+avSfHxAe+VHtYhF8sJ8jrdm0d7E4GKS9UR/pkLAL1JuRdJ1VkPx3
+-----END CERTIFICATE-----
+-----BEGIN CERTIFICATE-----
+MIIDFTCCAf2gAwIBAgIBATANBgkqhkiG9w0BAQsFADA0MTIwMAYDVQQDEylFbGFz
+dGljIENlcnRpZmljYXRlIFRvb2wgQXV0b2dlbmVyYXRlZCBDQTAeFw0yNDEyMjYy
+MjI3MTVaFw0yNTEyMjYyMjI3MTVaMDQxMjAwBgNVBAMTKUVsYXN0aWMgQ2VydGlm
+aWNhdGUgVG9vbCBBdXRvZ2VuZXJhdGVkIENBMIIBIjANBgkqhkiG9w0BAQEFAAOC
+AQ8AMIIBCgKCAQEArUe66xG4Y2zO13gRC+rBwyvxe+c01pqV6ukw6isIbJIQWs1/
+QfEMhUwYwKs6/UXxK+VwardcA2zYwngXbGGEtms+mpUfH5CdJnrqW7lHz1BVK4yH
+90IzGE0GU4D90OW/L4QkGX0fv3VQbL8KGFKBoF04pXIaSGMStFN4wirutHtQboYv
+99X4kbLjVSIuubUpA/v9dUP1TNl8ar+HKUWRM96ijHkFTF3FR0NnZyt44gP5qC0h
+i4lUiR6Uo9D6WMFjeRYFF7GolCy/I1SzWBmmOnNhQLO5VxcNG4ldhBcapZeGwE98
+m/5lxLIwgFR9ZP8bXdxZTWLC58/LQ2NqOjA9mwIDAQABozIwMDAPBgNVHRMBAf8E
+BTADAQH/MB0GA1UdDgQWBBTIJMnuftpfkxNCOkbF0R4xgcKQRjANBgkqhkiG9w0B
+AQsFAAOCAQEAhfg/cmXc4Uh90yiXU8jOW8saQjTsq4ZMDQiLfJsNmNNYmHFN0vhv
+lJRI1STdy7+GpjS5QbrMjQIxWSS8X8xysE4Rt81IrWmLuao35TRFyoiE1seBQ5sz
+p/BxZUe57JvWi9dyzv2df4UfWFdGBhzdr80odZmz4i5VIv6qCKJKsGikcuLpepmp
+E/UKnKHeR/dFWsxzA9P2OzHTUNBMOOA2PyAUL49pwoChwJeOWN/zAgwMWLbuHFG0
+IN0u8swAmeH98QdvzbhiOatGNpqfTNvQEDc19yVjfXKpBVZQ79WtronYSqrbrUa1
+T2zD8bIVP7CdddD/UmpT1SSKh4PJxudy5Q==
+-----END CERTIFICATE-----
data/spec/fixtures/test_certs/es.crt CHANGED
@@ -1,20 +1,19 @@
 -----BEGIN CERTIFICATE-----
-…
-qi02i4q6meHGcw==
+MIIDIzCCAgugAwIBAgIBATANBgkqhkiG9w0BAQsFADA0MTIwMAYDVQQDEylFbGFz
+dGljIENlcnRpZmljYXRlIFRvb2wgQXV0b2dlbmVyYXRlZCBDQTAeFw0yNDEyMjYy
+MjI3MTVaFw0yNTEyMjYyMjI3MTVaMA0xCzAJBgNVBAMTAmVzMIIBIjANBgkqhkiG
+9w0BAQEFAAOCAQ8AMIIBCgKCAQEArZLZvLSWDK7Ul+AaBnjU81dsfaow8zOjCC5V
+V21nXpYzQJoQbuWcvGYxwL7ZDs2ca4Wc8BVCj1NDduHuP7U+QIlUdQpl8kh5a0Zz
+36pcFw7UyF51/AzWixJrht/Azzkb5cpZtE22ZK0KhS4oCsjJmTN0EABAsGhDI9/c
+MjNrUC7iP0dvfOuzAPp7ufY83h98jKKXUYV24snbbvmqoWI6GQQNSG/sEo1+1UGH
+/z07/mVKoBAa5DVoNGvxN0fCE7vW7hkhT8+frJcsYFatAbnf6ql0KzEa8lN9u0gR
+hQNM3zcKKsjEMomBzVBc4SV3KXO0d/jGdDtlqsm2oXqlTMdtGwIDAQABo2cwZTAY
+BgNVHREEETAPgg1lbGFzdGljc2VhcmNoMAkGA1UdEwQCMAAwHQYDVR0OBBYEFFQU
+K+6Cg2kExRj1xSDzEi4kkgKXMB8GA1UdIwQYMBaAFMgkye5+2l+TE0I6RsXRHjGB
+wpBGMA0GCSqGSIb3DQEBCwUAA4IBAQB6cZ7IrDzcAoOZgAt9RlOe2yzQeH+alttp
+CSQVINjJotS1WvmtqjBB6ArqLpXIGU89TZsktNe/NQJzgYSaMnlIuHVLFdxJYmwU
+T1cP6VC/brmqP/dd5y7VWE7Lp+Wd5CxKl/WY+9chmgc+a1fW/lnPEJJ6pca1Bo8b
+byIL0yY2IUv4R2eh1IyQl9oGH1GOPLgO7cY04eajxYcOVA2eDSItoyDtrJfkFP/P
+UXtC1JAkvWKuujFEiBj0AannhroWlp3gvChhBwCuCAU0KXD6g8BE8tn6oT1+FW7J
+avSfHxAe+VHtYhF8sJ8jrdm0d7E4GKS9UR/pkLAL1JuRdJ1VkPx3
 -----END CERTIFICATE-----
data/spec/fixtures/test_certs/renew.sh ADDED
@@ -0,0 +1,15 @@
+#!/usr/bin/env bash
+
+set -e
+cd "$(dirname "$0")"
+
+openssl x509 -x509toreq -in ca.crt -copy_extensions copyall -signkey ca.key -out ca.csr
+openssl x509 -req -copy_extensions copyall -days 365 -in ca.csr -set_serial 0x01 -signkey ca.key -out ca.crt && rm ca.csr
+openssl x509 -in ca.crt -outform der | sha256sum | awk '{print $1}' > ca.der.sha256
+
+openssl x509 -x509toreq -in es.crt -copy_extensions copyall -signkey es.key -out es.csr
+openssl x509 -req -copy_extensions copyall -days 365 -in es.csr -set_serial 0x01 -CA ca.crt -CAkey ca.key -out es.crt && rm es.csr
+cat es.crt ca.crt > es.chain.crt
+
+# output ISO8601 timestamp to file
+date -Iseconds > GENERATED_AT
data/spec/inputs/cursor_tracker_spec.rb ADDED
@@ -0,0 +1,72 @@
+# encoding: utf-8
+require "logstash/devutils/rspec/spec_helper"
+require "logstash/devutils/rspec/shared_examples"
+require "logstash/inputs/elasticsearch"
+require "logstash/inputs/elasticsearch/cursor_tracker"
+
+describe LogStash::Inputs::Elasticsearch::CursorTracker do
+
+  let(:last_run_metadata_path) { Tempfile.new('cursor_tracker_testing').path }
+  let(:tracking_field_seed) { "1980-01-01T23:59:59.999999999Z" }
+  let(:options) do
+    {
+      :last_run_metadata_path => last_run_metadata_path,
+      :tracking_field => "my_field",
+      :tracking_field_seed => tracking_field_seed
+    }
+  end
+
+  subject { described_class.new(**options) }
+
+  it "creating a class works" do
+    expect(subject).to be_a described_class
+  end
+
+  describe "checkpoint_cursor" do
+    before(:each) do
+      subject.checkpoint_cursor(intermediate: false) # store seed value
+      [
+        Thread.new(subject) {|subject| subject.record_last_value(LogStash::Event.new("my_field" => "2025-01-03T23:59:59.999999999Z")) },
+        Thread.new(subject) {|subject| subject.record_last_value(LogStash::Event.new("my_field" => "2025-01-01T23:59:59.999999999Z")) },
+        Thread.new(subject) {|subject| subject.record_last_value(LogStash::Event.new("my_field" => "2025-01-02T23:59:59.999999999Z")) },
+      ].each(&:join)
+    end
+    context "when doing intermediate checkpoint" do
+      it "persists the smallest value" do
+        subject.checkpoint_cursor(intermediate: true)
+        expect(IO.read(last_run_metadata_path)).to eq("2025-01-01T23:59:59.999999999Z")
+      end
+    end
+    context "when doing non-intermediate checkpoint" do
+      it "persists the largest value" do
+        subject.checkpoint_cursor(intermediate: false)
+        expect(IO.read(last_run_metadata_path)).to eq("2025-01-03T23:59:59.999999999Z")
+      end
+    end
+  end
+
+  describe "inject_cursor" do
+    let(:new_value) { "2025-01-03T23:59:59.999999999Z" }
+    let(:fake_now) { "2026-09-19T23:59:59.999999999Z" }
+
+    let(:query) do
+      %q[
+      { "query": { "range": { "event.ingested": { "gt": :last_value, "lt": :present}}}, "sort": [ { "event.ingested": {"order": "asc", "format": "strict_date_optional_time_nanos", "numeric_type" : "date_nanos" } } ] }
+      ]
+    end
+
+    before(:each) do
+      subject.record_last_value(LogStash::Event.new("my_field" => new_value))
+      subject.checkpoint_cursor(intermediate: false)
+      allow(subject).to receive(:now_minus_30s).and_return(fake_now)
+    end
+
+    it "injects the value of the cursor into json query if it contains :last_value" do
+      expect(subject.inject_cursor(query)).to match(/#{new_value}/)
+    end
+
+    it "injects current time into json query if it contains :present" do
+      expect(subject.inject_cursor(query)).to match(/#{fake_now}/)
+    end
+  end
+end
data/spec/inputs/elasticsearch_spec.rb CHANGED
@@ -21,6 +21,13 @@ describe LogStash::Inputs::Elasticsearch, :ecs_compatibility_support do
   let(:es_version) { "7.5.0" }
   let(:cluster_info) { {"version" => {"number" => es_version, "build_flavor" => build_flavor}, "tagline" => "You Know, for Search"} }
 
+  def elastic_ruby_v8_client_available?
+    Elasticsearch::Transport
+    false
+  rescue NameError # NameError: uninitialized constant Elasticsearch::Transport if Elastic Ruby client is not available
+    true
+  end
+
   before(:each) do
     Elasticsearch::Client.send(:define_method, :ping) { } # define no-action ping method
     allow_any_instance_of(Elasticsearch::Client).to receive(:info).and_return(cluster_info)
@@ -92,9 +99,11 @@ describe LogStash::Inputs::Elasticsearch, :ecs_compatibility_support do
 
     before do
       allow(Elasticsearch::Client).to receive(:new).and_return(es_client)
-…
+      if elastic_ruby_v8_client_available?
+        allow(es_client).to receive(:info).and_raise(Elastic::Transport::Transport::Errors::BadRequest.new)
+      else
+        allow(es_client).to receive(:info).and_raise(Elasticsearch::Transport::Transport::Errors::BadRequest.new)
+      end
     end
 
     it "raises an exception" do
@@ -666,11 +675,28 @@ describe LogStash::Inputs::Elasticsearch, :ecs_compatibility_support do
     context 'if the `docinfo_target` exist but is not of type hash' do
       let(:config) { base_config.merge 'docinfo' => true, "docinfo_target" => 'metadata_with_string' }
       let(:do_register) { false }
+      let(:mock_queue) { double('Queue', :<< => nil) }
+      let(:hit) { response.dig('hits', 'hits').first }
+
+      it 'emits a tagged event with JSON-serialized event in [event][original]' do
+        allow(plugin).to receive(:logger).and_return(double('Logger').as_null_object)
 
-      it 'raises an exception if the `docinfo_target` exist but is not of type hash' do
-        expect(client).not_to receive(:clear_scroll)
         plugin.register
-…
+        plugin.run(mock_queue)
+
+        expect(mock_queue).to have_received(:<<) do |event|
+          expect(event).to be_a_kind_of LogStash::Event
+
+          expect(event.get('tags')).to include("_elasticsearch_input_failure")
+          expect(event.get('[event][original]')).to be_a_kind_of String
+          expect(JSON.load(event.get('[event][original]'))).to eq hit
+        end
+
+        expect(plugin.logger)
+          .to have_received(:warn).with(
+            a_string_including("Event creation error, original data now in [event][original] field"),
+            a_hash_including(:message => a_string_including('unable to merge docinfo fields into docinfo_target=`metadata_with_string`'),
+                             :data => a_string_including('"_id":"C5b2xLQwTZa76jBmHIbwHQ"')))
       end
 
     end
@@ -727,8 +753,13 @@ describe LogStash::Inputs::Elasticsearch, :ecs_compatibility_support do
     it "should set host(s)" do
       plugin.register
       client = plugin.send(:client)
-…
+      target_field = :@seeds
+      begin
+        Elasticsearch::Transport::Client
+      rescue
+        target_field = :@hosts
+      end
+      expect( client.transport.instance_variable_get(target_field) ).to eql [{
        :scheme => "https",
        :host => "ac31ebb90241773157043c34fd26fd46.us-central1.gcp.cloud.es.io",
        :port => 9243,
@@ -1134,7 +1165,7 @@ describe LogStash::Inputs::Elasticsearch, :ecs_compatibility_support do
 
     context "when there's an exception" do
       before(:each) do
-        allow(client).to receive(:search).and_raise RuntimeError
+        allow(client).to receive(:search).and_raise RuntimeError.new("test exception")
       end
       it 'produces no events' do
        plugin.run queue
@@ -1248,6 +1279,92 @@ describe LogStash::Inputs::Elasticsearch, :ecs_compatibility_support do
     end
   end
 
+  context '#push_hit' do
+    let(:config) do
+      {
+        'docinfo' => true, # include ids
+        'docinfo_target' => '[@metadata][docinfo]'
+      }
+    end
+
+    let(:hit) do
+      JSON.load(<<~EOJSON)
+        {
+          "_index" : "test_bulk_index_2",
+          "_type" : "_doc",
+          "_id" : "sHe6A3wBesqF7ydicQvG",
+          "_score" : 1.0,
+          "_source" : {
+            "@timestamp" : "2021-09-20T15:02:02.557Z",
+            "message" : "ping",
+            "@version" : "17",
+            "sequence" : 7,
+            "host" : {
+              "name" : "maybe.local",
+              "ip" : "127.0.0.1"
+            }
+          }
+        }
+      EOJSON
+    end
+
+    let(:mock_queue) { double('queue', :<< => nil) }
+
+    before(:each) do
+      plugin.send(:setup_cursor_tracker)
+    end
+
+    it 'pushes a generated event to the queue' do
+      plugin.send(:push_hit, hit, mock_queue)
+      expect(mock_queue).to have_received(:<<) do |event|
+        expect(event).to be_a_kind_of LogStash::Event
+
+        # fields overriding defaults
+        expect(event.timestamp.to_s).to eq("2021-09-20T15:02:02.557Z")
+        expect(event.get('@version')).to eq("17")
+
+        # structure from hit's _source
+        expect(event.get('message')).to eq("ping")
+        expect(event.get('sequence')).to eq(7)
+        expect(event.get('[host][name]')).to eq("maybe.local")
+        expect(event.get('[host][ip]')).to eq("127.0.0.1")
+
+        # docinfo fields
+        expect(event.get('[@metadata][docinfo][_index]')).to eq("test_bulk_index_2")
+        expect(event.get('[@metadata][docinfo][_type]')).to eq("_doc")
+        expect(event.get('[@metadata][docinfo][_id]')).to eq("sHe6A3wBesqF7ydicQvG")
+      end
+    end
+
+    context 'when event creation fails' do
+      before(:each) do
+        allow(plugin).to receive(:logger).and_return(double('Logger').as_null_object)
+
+        allow(plugin.event_factory).to receive(:new_event).and_call_original
+        allow(plugin.event_factory).to receive(:new_event).with(a_hash_including hit['_source']).and_raise(RuntimeError, 'intentional')
+      end
+
+      it 'pushes a tagged event containing a JSON-encoded hit in [event][original]' do
+        plugin.send(:push_hit, hit, mock_queue)
+
+        expect(mock_queue).to have_received(:<<) do |event|
+          expect(event).to be_a_kind_of LogStash::Event
+
+          expect(event.get('tags')).to include("_elasticsearch_input_failure")
+          expect(event.get('[event][original]')).to be_a_kind_of String
+          expect(JSON.load(event.get('[event][original]'))).to eq hit
+        end
+
+        expect(plugin.logger)
+          .to have_received(:warn).with(
+            a_string_including("Event creation error, original data now in [event][original] field"),
+            a_hash_including(:message => a_string_including('intentional'),
+                             :data => a_string_including('"_id":"sHe6A3wBesqF7ydicQvG"')))
+
+      end
+    end
+  end
+
   # @note can be removed once we depends on elasticsearch gem >= 6.x
   def extract_transport(client) # on 7.x client.transport is a ES::Transport::Client
     client.transport.respond_to?(:transport) ? client.transport.transport : client.transport
|
|
4
4
|
require "logstash/inputs/elasticsearch"
|
5
5
|
require_relative "../../../spec/es_helper"
|
6
6
|
|
7
|
-
describe LogStash::Inputs::Elasticsearch
|
7
|
+
describe LogStash::Inputs::Elasticsearch do
|
8
8
|
|
9
9
|
SECURE_INTEGRATION = ENV['SECURE_INTEGRATION'].eql? 'true'
|
10
10
|
|
@@ -76,6 +76,14 @@ describe LogStash::Inputs::Elasticsearch, :integration => true do
   shared_examples 'secured_elasticsearch' do
     it_behaves_like 'an elasticsearch index plugin'
 
+    let(:unauth_exception_class) do
+      begin
+        Elasticsearch::Transport::Transport::Errors::Unauthorized
+      rescue
+        Elastic::Transport::Transport::Errors::Unauthorized
+      end
+    end
+
     context "incorrect auth credentials" do
 
       let(:config) do
@@ -85,7 +93,7 @@ describe LogStash::Inputs::Elasticsearch, :integration => true do
       let(:queue) { [] }
 
       it "fails to run the plugin" do
-        expect { plugin.register }.to raise_error
+        expect { plugin.register }.to raise_error unauth_exception_class
      end
    end
  end
metadata CHANGED
@@ -1,14 +1,14 @@
 --- !ruby/object:Gem::Specification
 name: logstash-input-elasticsearch
 version: !ruby/object:Gem::Version
-  version: 5.0.0
+  version: 5.1.0
 platform: ruby
 authors:
 - Elastic
 autorequire:
 bindir: bin
 cert_chain: []
-date: …
+date: 2025-04-07 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   requirement: !ruby/object:Gem::Requirement
@@ -92,6 +92,9 @@ dependencies:
     - - ">="
       - !ruby/object:Gem::Version
         version: 7.17.9
+    - - "<"
+      - !ruby/object:Gem::Version
+        version: '9'
   name: elasticsearch
   type: :runtime
   prerelease: false
@@ -100,6 +103,9 @@ dependencies:
     - - ">="
       - !ruby/object:Gem::Version
         version: 7.17.9
+    - - "<"
+      - !ruby/object:Gem::Version
+        version: '9'
 - !ruby/object:Gem::Dependency
   requirement: !ruby/object:Gem::Requirement
   requirements:
@@ -272,21 +278,26 @@ files:
 - lib/logstash/helpers/loggable_try.rb
 - lib/logstash/inputs/elasticsearch.rb
 - lib/logstash/inputs/elasticsearch/aggregation.rb
+- lib/logstash/inputs/elasticsearch/cursor_tracker.rb
 - lib/logstash/inputs/elasticsearch/paginated_search.rb
 - lib/logstash/inputs/elasticsearch/patches/_elasticsearch_transport_connections_selector.rb
 - lib/logstash/inputs/elasticsearch/patches/_elasticsearch_transport_http_manticore.rb
 - logstash-input-elasticsearch.gemspec
 - spec/es_helper.rb
+- spec/fixtures/test_certs/GENERATED_AT
 - spec/fixtures/test_certs/ca.crt
 - spec/fixtures/test_certs/ca.der.sha256
 - spec/fixtures/test_certs/ca.key
+- spec/fixtures/test_certs/es.chain.crt
 - spec/fixtures/test_certs/es.crt
 - spec/fixtures/test_certs/es.key
+- spec/fixtures/test_certs/renew.sh
+- spec/inputs/cursor_tracker_spec.rb
 - spec/inputs/elasticsearch_spec.rb
 - spec/inputs/elasticsearch_ssl_spec.rb
 - spec/inputs/integration/elasticsearch_spec.rb
 - spec/inputs/paginated_search_spec.rb
-homepage: …
+homepage: https://elastic.co/logstash
 licenses:
 - Apache License (2.0)
 metadata:
@@ -313,11 +324,15 @@ specification_version: 4
 summary: Reads query results from an Elasticsearch cluster
 test_files:
 - spec/es_helper.rb
+- spec/fixtures/test_certs/GENERATED_AT
 - spec/fixtures/test_certs/ca.crt
 - spec/fixtures/test_certs/ca.der.sha256
 - spec/fixtures/test_certs/ca.key
+- spec/fixtures/test_certs/es.chain.crt
 - spec/fixtures/test_certs/es.crt
 - spec/fixtures/test_certs/es.key
+- spec/fixtures/test_certs/renew.sh
+- spec/inputs/cursor_tracker_spec.rb
 - spec/inputs/elasticsearch_spec.rb
 - spec/inputs/elasticsearch_ssl_spec.rb
 - spec/inputs/integration/elasticsearch_spec.rb