logstash-filter-elasticsearch 4.1.1 → 4.3.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- checksums.yaml +4 -4
- data/CHANGELOG.md +6 -0
- data/docs/index.asciidoc +181 -9
- data/lib/logstash/filters/elasticsearch/client.rb +8 -0
- data/lib/logstash/filters/elasticsearch/dsl_executor.rb +140 -0
- data/lib/logstash/filters/elasticsearch/esql_executor.rb +178 -0
- data/lib/logstash/filters/elasticsearch.rb +114 -114
- data/logstash-filter-elasticsearch.gemspec +3 -1
- data/spec/filters/elasticsearch_dsl_spec.rb +372 -0
- data/spec/filters/elasticsearch_esql_spec.rb +211 -0
- data/spec/filters/elasticsearch_spec.rb +140 -310
- data/spec/filters/integration/elasticsearch_esql_spec.rb +167 -0
- data/spec/filters/integration/elasticsearch_spec.rb +9 -2
- metadata +38 -2
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: deef535881ed311e6241449028bf6b54b65e0a42e6513b1ab7286de724f5445f
+  data.tar.gz: 103f17c0b6d6530c90afc412f69c05dc29ec4f76aecb7f82e22d7efd4f497647
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 54b3f3b78874934793ac7d7dca148236578b7756a20bddb70d020e14f9b222fbc70b4c8f2b7f132b238ed9d9690d4ac8e684749e959aa64d3ccd0b58aded7790
+  data.tar.gz: 98e18b162c4dd25827fb1b74ec62d75d152458950b8ee157e635a6c772247774aeaa82ec2ca5031e808361a02b454360438e4fd717deb07987d3d2519e00a327
data/CHANGELOG.md
CHANGED
@@ -1,3 +1,9 @@
+## 4.3.0
+- ES|QL support [#194](https://github.com/logstash-plugins/logstash-filter-elasticsearch/pull/194)
+
+## 4.2.0
+- Add `target` configuration option to store the result into it [#196](https://github.com/logstash-plugins/logstash-filter-elasticsearch/pull/196)
+
 ## 4.1.1
 - Add elastic-transport client support used in elasticsearch-ruby 8.x [#191](https://github.com/logstash-plugins/logstash-filter-elasticsearch/pull/191)
data/docs/index.asciidoc
CHANGED
@@ -54,7 +54,7 @@ if [type] == "end" {
 
 The example below reproduces the above example but utilises the query_template.
 This query_template represents a full Elasticsearch query DSL and supports the
-standard
+standard {ls} field substitution syntax. The example below issues
 the same query as the first example but uses the template shown.
 
 [source,ruby]
@@ -118,6 +118,110 @@ Authentication to a secure Elasticsearch cluster is possible using _one_ of the
 Authorization to a secure Elasticsearch cluster requires `read` permission at index level and `monitoring` permissions at cluster level.
 The `monitoring` permission at cluster level is necessary to perform periodic connectivity checks.
 
+[id="plugins-{type}s-{plugin}-esql"]
+==== {esql} support
+
+.Technical Preview
+****
+The {esql} feature that allows using ES|QL queries with this plugin is in Technical Preview.
+Configuration options and implementation details are subject to change in minor releases without being preceded by deprecation warnings.
+****
+
+{es} Query Language ({esql}) provides a SQL-like interface for querying your {es} data.
+
+To use {esql}, this plugin needs to be installed in {ls} 8.17.4 or newer, and must be connected to {es} 8.11 or newer.
+
+To configure an {esql} query in the plugin, set the query string in the `query` parameter.
+
+IMPORTANT: We recommend understanding {ref}/esql-limitations.html[{esql} current limitations] before using it in production environments.
+
+The following is a basic {esql} query that sets the food name on the transaction event based on the upstream event's food ID:
+[source, ruby]
+    filter {
+      elasticsearch {
+        hosts => [ 'https://..']
+        api_key => '....'
+        query => '
+          FROM food-index
+            | WHERE id == ?food_id
+        '
+        query_params => {
+          "food_id" => "[food][id]"
+        }
+      }
+    }
+
+Set `config.support_escapes: true` in `logstash.yml` if you need to escape special characters in the query.
+
+In the resulting event, the plugin sets the total result size in the `[@metadata][total_values]` field.
+
+[id="plugins-{type}s-{plugin}-esql-event-mapping"]
+===== Mapping {esql} result to {ls} event
+{esql} returns query results in a structured tabular format, where data is organized into _columns_ (fields) and _values_ (entries).
+The plugin maps each value entry to an event, populating the corresponding fields.
+For example, a query might produce a table like:
+
+[cols="2,1,1,1,2",options="header"]
+|===
+|`timestamp` |`user_id` | `action` | `status.code` | `status.desc`
+
+|2025-04-10T12:00:00 |123 |login |200 | Success
+|2025-04-10T12:05:00 |456 |purchase |403 | Forbidden (unauthorized user)
+|===
+
+For this case, the plugin creates the two JSON-like objects shown below and places them into the `target` field of the event if `target` is defined.
+If `target` is not defined, the plugin places _only_ the first result at the root of the event.
+[source, json]
+  [
+    {
+      "timestamp": "2025-04-10T12:00:00",
+      "user_id": 123,
+      "action": "login",
+      "status": {
+        "code": 200,
+        "desc": "Success"
+      }
+    },
+    {
+      "timestamp": "2025-04-10T12:05:00",
+      "user_id": 456,
+      "action": "purchase",
+      "status": {
+        "code": 403,
+        "desc": "Forbidden (unauthorized user)"
+      }
+    }
+  ]
+
+NOTE: If your index has a mapping with sub-objects where `status.code` and `status.desc` are actually dotted fields, they appear in {ls} events as a nested structure.
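The column/value mapping described above can be sketched in plain Ruby; `rows_from_esql` below is a hypothetical helper for illustration, not the plugin's actual code:

```ruby
# Hypothetical helper (not the plugin's code): map an ES|QL tabular response
# into one hash per row, nesting dotted column names like "status.code".
def rows_from_esql(response)
  names = response["columns"].map { |c| c["name"] }
  response["values"].map do |row|
    names.zip(row).each_with_object({}) do |(name, value), mapped|
      *path, leaf = name.split(".")
      scope = path.inject(mapped) { |s, part| s[part] ||= {} }
      scope[leaf] = value
    end
  end
end

response = {
  "columns" => [
    { "name" => "user_id",     "type" => "long" },
    { "name" => "status.code", "type" => "long" },
    { "name" => "status.desc", "type" => "keyword" }
  ],
  "values" => [[123, 200, "Success"], [456, 403, "Forbidden (unauthorized user)"]]
}

rows = rows_from_esql(response)
# rows[0] => {"user_id"=>123, "status"=>{"code"=>200, "desc"=>"Success"}}
```

With `target` set, the plugin stores the whole array on the target field; without it, only the first row lands at the event root.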
+
+[id="plugins-{type}s-{plugin}-esql-multifields"]
+===== Conflict on multi-fields
+
+An {esql} query fetches all parent fields and sub-fields if your {es} index has https://www.elastic.co/docs/reference/elasticsearch/mapping-reference/multi-fields[multi-fields] or https://www.elastic.co/docs/reference/elasticsearch/mapping-reference/subobjects[subobjects].
+Since {ls} events cannot contain a parent field's concrete value and sub-field values together, the plugin ignores the sub-fields with a warning and includes the parent.
+We recommend using the `RENAME` (or `DROP`, to avoid the warning) keyword in your {esql} query to explicitly rename the fields you want included in the event.
+
+This is a common occurrence if your template or mapping follows the pattern of always indexing strings as "text" (`field`) + "keyword" (`field.keyword`) multi-fields.
+In this case it's recommended to do `KEEP field` if the string is identical and there is only one sub-field, as the engine will optimize and retrieve the keyword; otherwise you can do `KEEP field.keyword | RENAME field.keyword AS field`.
+
+To illustrate the situation with an example, assume your mapping has a `time` field with `time.min` and `time.max` sub-fields as follows:
+[source, ruby]
+    "properties": {
+      "time": { "type": "long" },
+      "time.min": { "type": "long" },
+      "time.max": { "type": "long" }
+    }
+
+The {esql} result will contain all three fields but the plugin cannot map them into the {ls} event.
+To avoid this, you can use the `RENAME` keyword to rename the `time` parent field so that all three fields have unique names:
+[source, ruby]
+    ...
+    query => 'FROM my-index | RENAME time AS time.current'
+    ...
+
+For a comprehensive {esql} syntax reference and best practices, see the https://www.elastic.co/guide/en/elasticsearch/reference/current/esql-syntax.html[{esql} documentation].
+
 [id="plugins-{type}s-{plugin}-options"]
 ==== Elasticsearch Filter Configuration Options
@@ -143,6 +247,8 @@ NOTE: As of version `4.0.0` of this plugin, a number of previously deprecated se
 | <<plugins-{type}s-{plugin}-password>> |<<password,password>>|No
 | <<plugins-{type}s-{plugin}-proxy>> |<<uri,uri>>|No
 | <<plugins-{type}s-{plugin}-query>> |<<string,string>>|No
+| <<plugins-{type}s-{plugin}-query_type>> |<<string,string>>, one of `["dsl", "esql"]`|No
+| <<plugins-{type}s-{plugin}-query_params>> |<<hash,hash>> or <<array,array>>|No
 | <<plugins-{type}s-{plugin}-query_template>> |<<string,string>>|No
 | <<plugins-{type}s-{plugin}-result_size>> |<<number,number>>|No
 | <<plugins-{type}s-{plugin}-retry_on_failure>> |<<number,number>>|No
@@ -162,6 +268,7 @@ NOTE: As of version `4.0.0` of this plugin, a number of previously deprecated se
 | <<plugins-{type}s-{plugin}-ssl_truststore_type>> |<<string,string>>|No
 | <<plugins-{type}s-{plugin}-ssl_verification_mode>> |<<string,string>>, one of `["full", "none"]`|No
 | <<plugins-{type}s-{plugin}-tag_on_failure>> |<<array,array>>|No
+| <<plugins-{type}s-{plugin}-target>> |<<string,string>>|No
 | <<plugins-{type}s-{plugin}-user>> |<<string,string>>|No
 |=======================================================================
 
@@ -175,8 +282,11 @@ filter plugins.
 
 * Value type is <<hash,hash>>
 * Default value is `{}`
+* Format: `"aggregation_name" => "[path][on][event]"`:
+** `aggregation_name`: aggregation name in the result from {es}
+** `[path][on][event]`: path for where to place the value on the current event, using field-reference notation
 
-
+A mapping of aggregations to copy into the <<plugins-{type}s-{plugin}-target>> of the current event.
 
 Example:
 [source,ruby]
@@ -246,8 +356,11 @@ These custom headers will override any headers previously set by the plugin such
 
 * Value type is <<hash,hash>>
 * Default value is `{}`
+* Format: `"path.in.source" => "[path][on][event]"`:
+** `path.in.source`: field path in the document source of the result from {es}, using dot-notation
+** `[path][on][event]`: path for where to place the value on the current event, using field-reference notation
 
-
+A mapping of docinfo (`_source`) fields to copy into the <<plugins-{type}s-{plugin}-target>> of the current event.
 
 Example:
 [source,ruby]
@@ -273,9 +386,11 @@ Whether results should be sorted or not
 
 * Value type is <<array,array>>
 * Default value is `{}`
+* Format: `"path.in.result" => "[path][on][event]"`:
+** `path.in.result`: field path in the indexed result from {es}, using dot-notation
+** `[path][on][event]`: path for where to place the value on the current event, using field-reference notation
 
-
-new event, currently being processed.
+A mapping of indexed fields to copy into the <<plugins-{type}s-{plugin}-target>> of the current event.
 
 In the following example, the values of `@timestamp` and `event_id` on the event
 found via elasticsearch are copied to the current event's
@@ -330,11 +445,30 @@ environment variables e.g. `proxy => '${LS_PROXY:}'`.
 
 * Value type is <<string,string>>
 * There is no default value for this setting.
 
-
-
-string
-
+The query to be executed.
+The accepted query shape is a DSL query string or ES|QL.
+For the DSL query string, use either `query` or `query_template`.
+Read the {ref}/query-dsl-query-string-query.html[{es} query
+string documentation] or {ref}/esql.html[{es} ES|QL documentation] for more information.
 
+[id="plugins-{type}s-{plugin}-query_type"]
+===== `query_type`
+
+* Value can be `dsl` or `esql`
+* Default value is `dsl`
+
+Defines the <<plugins-{type}s-{plugin}-query>> shape.
+When `dsl`, the query shape must be a valid {es} JSON-style string.
+When `esql`, the query shape must be a valid {esql} string, and the `index`, `query_template`, and `sort` parameters are not allowed.
+
+[id="plugins-{type}s-{plugin}-query_params"]
+===== `query_params`
+
+* The value type is <<hash,hash>> or <<array,array>>. When an array is provided, the array elements are pairs of `key` and `value`.
+* There is no default value for this setting
+
+Named parameters in {esql} to send to {es} together with <<plugins-{type}s-{plugin}-query>>.
+Visit the {ref}/esql-rest.html#esql-rest-params[passing parameters to a query] page for more information.
 
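The hash-or-array contract for `query_params`, and the split between event field references (values shaped like `[field][path]`) and static values, can be sketched as follows. `split_query_params` is a hypothetical name, and the normalization of the array form into a flat hash is an assumption about the accepted shape; the reference-detection regex matches the one used by the plugin's ES|QL executor:

```ruby
# Sketch: accept query_params as a hash, or as an array of single-entry
# hashes, then split field references from static values.
def split_query_params(query_params)
  flat = query_params.is_a?(Array) ? query_params.reduce({}) { |acc, pair| acc.merge(pair) } : query_params
  referenced, static = flat.partition { |_k, v| v.is_a?(String) && v.match?(/^\[.*\]$/) }
  [referenced.to_h, static.to_h]
end

referenced, static = split_query_params("food_id" => "[food][id]", "page_size" => 10)
# referenced => {"food_id"=>"[food][id]"}, static => {"page_size"=>10}

# Array form: each element is one key/value pair.
r2, s2 = split_query_params([{ "a" => "[meta][a]" }, { "b" => 1 }])
```

References are resolved against the current event at filter time; static values are forwarded to {es} unchanged.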
 [id="plugins-{type}s-{plugin}-query_template"]
 ===== `query_template`
@@ -523,6 +657,44 @@ WARNING: Setting certificate verification to `none` disables many security benef
 
 Tags the event on failure to look up previous log event information. This can be used in later analysis.
 
+[id="plugins-{type}s-{plugin}-target"]
+===== `target`
+
+* Value type is <<string,string>>
+* There is no default value for this setting.
+
+Define the target field for placing the result data.
+If this setting is omitted, the target will be the root (top level) of the event.
+Setting it is highly recommended when using `query_type => 'esql'`, so that all query results are set on the event.
+
+When `query_type => 'dsl'`, the destination fields specified in <<plugins-{type}s-{plugin}-fields>>, <<plugins-{type}s-{plugin}-aggregation_fields>>, and <<plugins-{type}s-{plugin}-docinfo_fields>> are relative to this target.
+
+For example, if you want the data to be put in the `transaction` field:
+[source,ruby]
+    if [type] == "end" {
+      filter {
+        elasticsearch {
+          query => "type:start AND transaction:%{[transactionId]}"
+          target => "transaction"
+          fields => {
+            "@timestamp" => "started"
+            "transaction_id" => "id"
+          }
+        }
+      }
+    }
+
+The `fields` entries will be expanded into a data structure in the `target` field; the overall shape looks like this:
+[source,ruby]
+    {
+      "transaction" => {
+        "started" => "2025-04-29T12:01:46.263Z"
+        "id" => "1234567890"
+      }
+    }
+
+NOTE: When writing to a field that already exists on the event, the previous value will be overwritten.
+
 [id="plugins-{type}s-{plugin}-user"]
 ===== `user`
 
data/lib/logstash/filters/elasticsearch/client.rb
CHANGED
@@ -58,11 +58,19 @@ module LogStash
       def search(params={})
         @client.search(params)
       end
+
+      def esql_query(params={})
+        @client.esql.query(params)
+      end
 
       def info
         @client.info
       end
 
+      def es_version
+        info&.dig('version', 'number')
+      end
+
       def build_flavor
         @build_flavor ||= info&.dig('version', 'build_flavor')
       end
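The new `es_version` helper returns a dotted version string via a nil-safe `dig`. A standalone sketch of how a caller might use it for the "{es} 8.11 or newer" requirement stated in the docs; the `esql_ready` check is illustrative, not plugin code:

```ruby
# Hash#dig walks nested keys and returns nil at the first miss;
# `&.` additionally guards against the info hash itself being nil.
info = { 'version' => { 'number' => '8.17.4', 'build_flavor' => 'default' } }

es_version = info&.dig('version', 'number')  # nil-safe lookup, as in the client
esql_ready = es_version && Gem::Version.new(es_version) >= Gem::Version.new('8.11')

# When the cluster is unreachable, info may be nil and the check degrades gracefully.
no_info = nil
unknown_version = no_info&.dig('version', 'number')  # => nil, no NoMethodError
```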
data/lib/logstash/filters/elasticsearch/dsl_executor.rb
ADDED
@@ -0,0 +1,140 @@
+# encoding: utf-8
+
+module LogStash
+  module Filters
+    class Elasticsearch
+      class DslExecutor
+        def initialize(plugin, logger)
+          @index = plugin.params["index"]
+          @query = plugin.params["query"]
+          @query_dsl = plugin.query_dsl
+          @fields = plugin.params["fields"]
+          @result_size = plugin.params["result_size"]
+          @docinfo_fields = plugin.params["docinfo_fields"]
+          @tag_on_failure = plugin.params["tag_on_failure"]
+          @enable_sort = plugin.params["enable_sort"]
+          @sort = plugin.params["sort"]
+          @aggregation_fields = plugin.params["aggregation_fields"]
+          @logger = logger
+          @event_decorator = plugin.method(:decorate)
+          @target_field = plugin.params["target"]
+          if @target_field
+            def self.apply_target(path); "[#{@target_field}][#{path}]"; end
+          else
+            def self.apply_target(path); path; end
+          end
+        end
+
+        def process(client, event)
+          matched = false
+          begin
+            params = { :index => event.sprintf(@index) }
+
+            if @query_dsl
+              query = LogStash::Json.load(event.sprintf(@query_dsl))
+              params[:body] = query
+            else
+              query = event.sprintf(@query)
+              params[:q] = query
+              params[:size] = @result_size
+              params[:sort] = @sort if @enable_sort
+            end
+
+            @logger.debug("Querying elasticsearch for lookup", :params => params)
+
+            results = client.search(params)
+            raise "Elasticsearch query error: #{results["_shards"]["failures"]}" if results["_shards"].include? "failures"
+
+            event.set("[@metadata][total_hits]", extract_total_from_hits(results['hits']))
+
+            result_hits = results["hits"]["hits"]
+            if !result_hits.nil? && !result_hits.empty?
+              matched = true
+              @fields.each do |old_key, new_key|
+                old_key_path = extract_path(old_key)
+                extracted_hit_values = result_hits.map do |doc|
+                  extract_value(doc["_source"], old_key_path)
+                end
+                value_to_set = extracted_hit_values.count > 1 ? extracted_hit_values : extracted_hit_values.first
+                set_to_event_target(event, new_key, value_to_set)
+              end
+              @docinfo_fields.each do |old_key, new_key|
+                old_key_path = extract_path(old_key)
+                extracted_docs_info = result_hits.map do |doc|
+                  extract_value(doc, old_key_path)
+                end
+                value_to_set = extracted_docs_info.count > 1 ? extracted_docs_info : extracted_docs_info.first
+                set_to_event_target(event, new_key, value_to_set)
+              end
+            end
+
+            result_aggregations = results["aggregations"]
+            if !result_aggregations.nil? && !result_aggregations.empty?
+              matched = true
+              @aggregation_fields.each do |agg_name, ls_field|
+                set_to_event_target(event, ls_field, result_aggregations[agg_name])
+              end
+            end
+
+          rescue => e
+            if @logger.trace?
+              @logger.warn("Failed to query elasticsearch for previous event", :index => @index, :query => @query, :event => event.to_hash, :error => e.message, :backtrace => e.backtrace)
+            elsif @logger.debug?
+              @logger.warn("Failed to query elasticsearch for previous event", :index => @index, :error => e.message, :backtrace => e.backtrace)
+            else
+              @logger.warn("Failed to query elasticsearch for previous event", :index => @index, :error => e.message)
+            end
+            @tag_on_failure.each { |tag| event.tag(tag) }
+          else
+            @event_decorator.call(event) if matched
+          end
+        end
+
+        private
+
+        # Given a "hits" object from an Elasticsearch response, return the total number of hits in
+        # the result set.
+        # @param hits [Hash{String=>Object}]
+        # @return [Integer]
+        def extract_total_from_hits(hits)
+          total = hits['total']
+
+          # Elasticsearch 7.x produces an object containing `value` and `relation` in order
+          # to enable unambiguous reporting when the total is only a lower bound; if we get
+          # an object back, return its `value`.
+          return total['value'] if total.kind_of?(Hash)
+          total
+        end
+
+        # get an array of path elements from a path reference
+        def extract_path(path_reference)
+          return [path_reference] unless path_reference.start_with?('[') && path_reference.end_with?(']')
+
+          path_reference[1...-1].split('][')
+        end
+
+        # given a Hash and an array of path fragments, returns the value at the path
+        # @param source [Hash{String=>Object}]
+        # @param path [Array{String}]
+        # @return [Object]
+        def extract_value(source, path)
+          path.reduce(source) do |memo, old_key_fragment|
+            break unless memo.include?(old_key_fragment)
+            memo[old_key_fragment]
+          end
+        end
+
+        # if @target is defined, creates a nested structure to inject result into target field
+        # if not defined, directly sets to the top-level event field
+        # @param event [LogStash::Event]
+        # @param new_key [String] name of the field to set
+        # @param value_to_set [Array] values to set
+        # @return [void]
+        def set_to_event_target(event, new_key, value_to_set)
+          key_to_set = self.apply_target(new_key)
+          event.set(key_to_set, value_to_set)
+        end
+      end
+    end
+  end
+end
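The `extract_path`/`extract_value` pair above is self-contained enough to exercise outside the plugin; a quick standalone check of its behavior on a sample `_source` hash:

```ruby
# Turn a Logstash-style field reference like "[status][code]" into path
# fragments, then walk a hash along that path.
def extract_path(path_reference)
  return [path_reference] unless path_reference.start_with?('[') && path_reference.end_with?(']')
  path_reference[1...-1].split('][')
end

def extract_value(source, path)
  path.reduce(source) do |memo, fragment|
    break unless memo.include?(fragment)  # break returns nil on a missing key
    memo[fragment]
  end
end

source = { "status" => { "code" => 200 }, "action" => "login" }
extract_path("[status][code]")                         # => ["status", "code"]
extract_value(source, ["status", "code"])              # => 200
extract_value(source, extract_path("action"))          # => "login"
extract_value(source, extract_path("[status][nope]"))  # => nil
```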
data/lib/logstash/filters/elasticsearch/esql_executor.rb
ADDED
@@ -0,0 +1,178 @@
+# encoding: utf-8
+
+module LogStash
+  module Filters
+    class Elasticsearch
+      class EsqlExecutor
+
+        ESQL_PARSERS_BY_TYPE = Hash.new(lambda { |x| x }).merge(
+          'date' => ->(value) { value && LogStash::Timestamp.new(value) },
+        )
+
+        def initialize(plugin, logger)
+          @logger = logger
+
+          @event_decorator = plugin.method(:decorate)
+          @query = plugin.params["query"]
+
+          query_params = plugin.query_params || {}
+          reference_valued_params, static_valued_params = query_params.partition { |_, v| v.kind_of?(String) && v.match?(/^\[.*\]$/) }
+          @referenced_params = reference_valued_params&.to_h
+          # keep static params as an array of hashes to attach to the ES|QL api param easily
+          @static_params = static_valued_params.map { |k, v| { k => v } }
+          @tag_on_failure = plugin.params["tag_on_failure"]
+          @logger.debug("ES|QL query executor initialized with ", query: @query, query_params: query_params)
+
+          # if the target is specified, all result entries will be copied to the target field
+          # otherwise, the first value of the result will be copied to the event
+          @target_field = plugin.params["target"]
+          @logger.warn("Only first query result will be copied to the event. Please specify `target` in plugin config to include all") if @target_field.nil?
+        end
+
+        def process(client, event)
+          resolved_params = @referenced_params&.any? ? resolve_parameters(event) : []
+          resolved_params.concat(@static_params) if @static_params&.any?
+          response = execute_query(client, resolved_params)
+          inform_warning(response)
+          process_response(event, response)
+          @event_decorator.call(event)
+        rescue => e
+          @logger.error("Failed to process ES|QL filter", exception: e)
+          @tag_on_failure.each { |tag| event.tag(tag) }
+        end
+
+        private
+
+        def resolve_parameters(event)
+          @referenced_params.map do |key, value|
+            begin
+              resolved_value = event.get(value)
+              @logger.debug("Resolved value for #{key}: #{resolved_value}, its class: #{resolved_value.class}")
+              { key => resolved_value }
+            rescue => e
+              # catches invalid field reference
+              raise "Failed to resolve parameter `#{key}` with `#{value}`. Error: #{e.message}"
+            end
+          end
+        end
+
+        def execute_query(client, params)
+          # debug logs may help to check what query shape the plugin is sending to ES
+          @logger.debug("Executing ES|QL query", query: @query, params: params)
+          client.esql_query({ body: { query: @query, params: params }, format: 'json', drop_null_columns: true })
+        end
+
+        def process_response(event, response)
+          columns = response['columns']&.freeze || []
+          values = response['values']&.freeze || []
+          if values.nil? || values.size == 0
+            @logger.debug("Empty ES|QL query result", columns: columns, values: values)
+            return
+          end
+
+          # this shouldn't happen, but checked just in case to avoid crashing the plugin
+          if columns.nil? || columns.size == 0
+            @logger.error("No columns exist but received values", columns: columns, values: values)
+            return
+          end
+
+          event.set("[@metadata][total_values]", values.size)
+          @logger.debug("ES|QL query result values size ", size: values.size)
+
+          column_specs = columns.map { |column| ColumnSpec.new(column) }
+          sub_element_mark_map = mark_sub_elements(column_specs)
+          multi_fields = sub_element_mark_map.filter_map { |key, val| key.name if val == true }
+
+          @logger.debug("Multi-fields found in ES|QL result and they will not be available in the event. Please use `RENAME` command if you want to include them.", { :detected_multi_fields => multi_fields }) if multi_fields.any?
+
+          if @target_field
+            values_to_set = values.map do |row|
+              mapped_data = column_specs.each_with_index.with_object({}) do |(column, index), mapped_data|
+                # `drop_null_columns` only removes columns that are `nil` in every row; when a column
+                # has a mix, `nil` values can still appear, so we filter them out on each individual
+                # row to achieve a full `drop_null_columns` per row (the ideal `LIMIT 1` result).
+                # We also exclude sub-elements of the base field.
+                if row[index] && sub_element_mark_map[column] == false
+                  value_to_set = ESQL_PARSERS_BY_TYPE[column.type].call(row[index])
+                  mapped_data[column.name] = value_to_set
+                end
+              end
+              generate_nested_structure(mapped_data) unless mapped_data.empty?
+            end
+            event.set("[#{@target_field}]", values_to_set)
+          else
+            column_specs.zip(values.first).each do |(column, value)|
+              if value && sub_element_mark_map[column] == false
+                value_to_set = ESQL_PARSERS_BY_TYPE[column.type].call(value)
+                event.set(column.field_reference, value_to_set)
+              end
+            end
+          end
+        end
+
+        def inform_warning(response)
+          return unless (warning = response&.headers&.dig('warning'))
+          @logger.warn("ES|QL executor received warning", { message: warning })
+        end
+
+        # Transforms dotted keys to nested JSON shape
+        # @param dot_keyed_hash [Hash] whose keys are dotted (example 'a.b.c.d': 'val')
+        # @return [Hash] whose keys are nested with value mapped ({'a':{'b':{'c':{'d':'val'}}}})
+        def generate_nested_structure(dot_keyed_hash)
+          dot_keyed_hash.each_with_object({}) do |(key, value), result|
+            key_parts = key.to_s.split('.')
+            *path, leaf = key_parts
+            leaf_scope = path.inject(result) { |scope, part| scope[part] ||= {} }
+            leaf_scope[leaf] = value
+          end
+        end
+
+        # Determines whether each column in a collection is a nested sub-element (e.g "user.age")
+        # of another column in the same collection (e.g "user").
+        #
+        # @param columns [Array<ColumnSpec>] An array of objects with a `name` attribute representing field paths.
+        # @return [Hash<ColumnSpec, Boolean>] A hash mapping each column to `true` if it is a sub-element of another field, `false` otherwise.
+        # Time complexity: O(NlogN + N*K) where K is the conflict depth;
+        # without (`prefix_set`) memoization, it would be O(N^2)
+        def mark_sub_elements(columns)
+          # Sort columns by name length (ascending)
+          sorted_columns = columns.sort_by { |c| c.name.length }
+          prefix_set = Set.new # memoization set
+
+          sorted_columns.each_with_object({}) do |column, memo|
+            # Split the column name into parts (e.g., "user.profile.age" → ["user", "profile", "age"])
+            parts = column.name.split('.')
+
+            # Generate all possible parent prefixes (e.g., "user", "user.profile")
+            # and check if any parent prefix exists in the set
+            parent_prefixes = (0...parts.size - 1).map { |i| parts[0..i].join('.') }
+            memo[column] = parent_prefixes.any? { |prefix| prefix_set.include?(prefix) }
+            prefix_set.add(column.name)
+          end
+        end
+      end
+
+      # Class representing a column specification in the ESQL response['columns']
+      # The class's main purpose is to provide a structure for the event key
+      # columns is an array of `name` and `type` pairs (example: `{"name"=>"@timestamp", "type"=>"date"}`)
+      # @attr_reader :name [String] The name of the column
+      # @attr_reader :type [String] The type of the column
+      class ColumnSpec
+        attr_reader :name, :type
+
+        def initialize(spec)
+          @name = isolate(spec.fetch('name'))
+          @type = isolate(spec.fetch('type'))
+        end
+
+        def field_reference
+          @_field_reference ||= '[' + name.gsub('.', '][') + ']'
+        end
+
+        private
+        def isolate(value)
+          value.frozen? ? value : value.clone.freeze
+        end
+      end
+    end
+  end
+end
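The prefix-set technique in `mark_sub_elements` can be demonstrated on plain column-name strings (a standalone sketch of the same algorithm, operating on strings instead of `ColumnSpec` objects):

```ruby
require 'set'

# Flag any column name that is a dotted sub-element of another column
# (e.g. "time.min" when "time" is also present). Sorting by length ensures
# every potential parent is in the memoized prefix set before its children
# are checked, keeping the scan roughly linear.
def mark_sub_elements(names)
  prefix_set = Set.new
  names.sort_by(&:length).each_with_object({}) do |name, memo|
    parts = name.split('.')
    parent_prefixes = (0...parts.size - 1).map { |i| parts[0..i].join('.') }
    memo[name] = parent_prefixes.any? { |prefix| prefix_set.include?(prefix) }
    prefix_set.add(name)
  end
end

marks = mark_sub_elements(%w[time time.min time.max user.name])
# => {"time"=>false, "time.min"=>true, "time.max"=>true, "user.name"=>false}
```

This matches the doc section on multi-field conflicts: `time.min` and `time.max` are flagged and skipped unless the query `RENAME`s them away from the `time` parent.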