logstash-filter-elasticsearch 3.18.0 → 3.19.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- checksums.yaml +4 -4
- data/CHANGELOG.md +3 -0
- data/docs/index.asciidoc +132 -6
- data/lib/logstash/filters/elasticsearch/client.rb +8 -0
- data/lib/logstash/filters/elasticsearch/dsl_executor.rb +140 -0
- data/lib/logstash/filters/elasticsearch/esql_executor.rb +178 -0
- data/lib/logstash/filters/elasticsearch.rb +106 -129
- data/logstash-filter-elasticsearch.gemspec +1 -1
- data/spec/filters/elasticsearch_dsl_spec.rb +372 -0
- data/spec/filters/elasticsearch_esql_spec.rb +211 -0
- data/spec/filters/elasticsearch_spec.rb +129 -326
- data/spec/filters/integration/elasticsearch_esql_spec.rb +167 -0
- metadata +10 -2
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 28c0544d8e61fe99078cabecc2ce4cf898aefc35775e80a71a5bb34221d534d5
+  data.tar.gz: 6f42fd12be09e9aa4b619c95ffff8cc9dba427f5f71f915d2e7c30a583f83fec
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 3aaa9a9b8b31e1782e927fe4bccb26fd28063206dfeaf00c574657e0d91e3c7c7544565c657a65d37a5e164c0e2477b8fdab3f4dfb2c7feb79971962e94cbb87
+  data.tar.gz: 339a881e589fe44d5657e4a4417f9e72f38921a08ff99b2cbc1534785fcddb00ae484960311a70d653f3ca058f38ceed0a290d789bb0d728d7db1ff8de8a6d46
data/CHANGELOG.md
CHANGED
@@ -1,3 +1,6 @@
+## 3.19.0
+- ES|QL support [#199](https://github.com/logstash-plugins/logstash-filter-elasticsearch/pull/199)
+
 ## 3.18.0
 - Add `target` configuration option to store the result into it [#197](https://github.com/logstash-plugins/logstash-filter-elasticsearch/pull/197)
 
data/docs/index.asciidoc
CHANGED
@@ -54,7 +54,7 @@ if [type] == "end" {
 
 The example below reproduces the above example but utilises the query_template.
 This query_template represents a full Elasticsearch query DSL and supports the
-standard
+standard {ls} field substitution syntax. The example below issues
 the same query as the first example but uses the template shown.
 
 [source,ruby]
@@ -118,6 +118,110 @@ Authentication to a secure Elasticsearch cluster is possible using _one_ of the
 Authorization to a secure Elasticsearch cluster requires `read` permission at index level and `monitoring` permissions at cluster level.
 The `monitoring` permission at cluster level is necessary to perform periodic connectivity checks.
 
+[id="plugins-{type}s-{plugin}-esql"]
+==== {esql} support
+
+.Technical Preview
+****
+The {esql} feature that allows using ES|QL queries with this plugin is in Technical Preview.
+Configuration options and implementation details are subject to change in minor releases without being preceded by deprecation warnings.
+****
+
+{es} Query Language ({esql}) provides a SQL-like interface for querying your {es} data.
+
+To use {esql}, this plugin needs to be installed in {ls} 8.17.4 or newer, and must be connected to {es} 8.11 or newer.
+
+To configure an {esql} query in the plugin, set the query string in the `query` parameter.
+
+IMPORTANT: We recommend understanding {ref}/esql-limitations.html[{esql} current limitations] before using it in production environments.
+
+The following is a basic {esql} query that sets the food name on the transaction event based on the upstream event's food ID:
+[source, ruby]
+    filter {
+      elasticsearch {
+        hosts => [ 'https://..']
+        api_key => '....'
+        query => '
+          FROM food-index
+            | WHERE id == ?food_id
+          '
+        query_params => {
+          "food_id" => "[food][id]"
+        }
+      }
+    }
+
+Set `config.support_escapes: true` in `logstash.yml` if you need to escape special characters in the query.
+
+In the result event, the plugin sets the total result size in the `[@metadata][total_values]` field.
+
+[id="plugins-{type}s-{plugin}-esql-event-mapping"]
+===== Mapping {esql} result to {ls} event
+{esql} returns query results in a structured tabular format, where data is organized into _columns_ (fields) and _values_ (entries).
+The plugin maps each value entry to an event, populating the corresponding fields.
+For example, a query might produce a table like:
+
+[cols="2,1,1,1,2",options="header"]
+|===
+|`timestamp` |`user_id` | `action` | `status.code` | `status.desc`
+
+|2025-04-10T12:00:00 |123 |login |200 | Success
+|2025-04-10T12:05:00 |456 |purchase |403 | Forbidden (unauthorized user)
+|===
+
+In this case, the plugin creates the two JSON-like objects below and places them into the `target` field of the event if `target` is defined.
+If `target` is not defined, the plugin places _only_ the first result at the root of the event.
+[source, json]
+    [
+      {
+        "timestamp": "2025-04-10T12:00:00",
+        "user_id": 123,
+        "action": "login",
+        "status": {
+          "code": 200,
+          "desc": "Success"
+        }
+      },
+      {
+        "timestamp": "2025-04-10T12:05:00",
+        "user_id": 456,
+        "action": "purchase",
+        "status": {
+          "code": 403,
+          "desc": "Forbidden (unauthorized user)"
+        }
+      }
+    ]
+
+NOTE: If your index mapping uses sub-objects, so that `status.code` and `status.desc` are actually dotted field names, they still appear in {ls} events as a nested structure.
+
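The tabular-to-object mapping described above can be sketched in standalone Ruby. This is illustrative only; `rows_to_objects` is a hypothetical helper, not part of the plugin's API:

```ruby
# Sketch of the documented mapping: an ES|QL tabular result (columns + values)
# becomes an array of nested hash objects; dotted column names are expanded.
def rows_to_objects(columns, values)
  values.map do |row|
    row.each_with_index.each_with_object({}) do |(value, i), obj|
      # expand a dotted column name ("status.code") into a nested path
      *path, leaf = columns[i].fetch('name').split('.')
      scope = path.inject(obj) { |s, part| s[part] ||= {} }
      scope[leaf] = value
    end
  end
end

columns = [{ 'name' => 'timestamp' }, { 'name' => 'user_id' }, { 'name' => 'action' },
           { 'name' => 'status.code' }, { 'name' => 'status.desc' }]
values = [['2025-04-10T12:00:00', 123, 'login', 200, 'Success'],
          ['2025-04-10T12:05:00', 456, 'purchase', 403, 'Forbidden (unauthorized user)']]

objects = rows_to_objects(columns, values)
# objects[0]['status'] => {"code"=>200, "desc"=>"Success"}
```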
+[id="plugins-{type}s-{plugin}-esql-multifields"]
+===== Conflict on multi-fields
+
+An {esql} query fetches both parent fields and sub-fields if your {es} index has https://www.elastic.co/docs/reference/elasticsearch/mapping-reference/multi-fields[multi-fields] or https://www.elastic.co/docs/reference/elasticsearch/mapping-reference/subobjects[subobjects].
+Since {ls} events cannot contain a parent field's concrete value and sub-field values together, the plugin ignores the sub-fields with a warning and includes only the parent field.
+We recommend using the `RENAME` (or `DROP`, to avoid the warning) command in your {esql} query to explicitly rename the fields whose sub-fields you want included in the event.
+
+This is a common occurrence if your template or mapping follows the pattern of always indexing strings as "text" (`field`) + "keyword" (`field.keyword`) multi-field.
+In this case, it's recommended to use `KEEP field` if the string is identical and there is only one sub-field, as the engine will optimize and retrieve the keyword; otherwise, use `KEEP field.keyword | RENAME field.keyword AS field`.
+
+To illustrate, assume your mapping has a `time` field with `time.min` and `time.max` sub-fields:
+[source, ruby]
+    "properties": {
+      "time": { "type": "long" },
+      "time.min": { "type": "long" },
+      "time.max": { "type": "long" }
+    }
+
+The {esql} result will contain all three fields, but the plugin cannot map them all into the {ls} event.
+To avoid this, use the `RENAME` command to rename the `time` parent field so that all three fields have unique names.
+[source, ruby]
+    ...
+    query => 'FROM my-index | RENAME time AS time.current'
+    ...
+
+For comprehensive ES|QL syntax reference and best practices, see the https://www.elastic.co/guide/en/elasticsearch/reference/current/esql-syntax.html[{esql} documentation].
+
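The sub-field conflict described above reduces to a prefix check over the returned column names. A standalone sketch of the idea (illustrative; `sub_field_names` is a hypothetical helper, not a plugin method):

```ruby
require 'set'

# Mark every column name that is a sub-field of another returned column:
# a name conflicts when any of its dot-separated parent prefixes is also
# a returned column. Shorter names are processed first so parents are
# already in the set when their sub-fields are checked.
def sub_field_names(names)
  prefix_set = Set.new
  marks = names.sort_by(&:length).to_h do |name|
    parts = name.split('.')
    parents = (0...parts.size - 1).map { |i| parts[0..i].join('.') }
    is_sub = parents.any? { |p| prefix_set.include?(p) }
    prefix_set.add(name)
    [name, is_sub]
  end
  marks.select { |_, sub| sub }.keys.sort
end

sub_field_names(%w[time time.min time.max user.name])
# => ["time.max", "time.min"] ("user" itself is not a returned column, so "user.name" is kept)
```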
 [id="plugins-{type}s-{plugin}-options"]
 ==== Elasticsearch Filter Configuration Options
 
@@ -140,6 +244,8 @@ This plugin supports the following configuration options plus the <<plugins-{typ
 | <<plugins-{type}s-{plugin}-password>> |<<password,password>>|No
 | <<plugins-{type}s-{plugin}-proxy>> |<<uri,uri>>|No
 | <<plugins-{type}s-{plugin}-query>> |<<string,string>>|No
+| <<plugins-{type}s-{plugin}-query_type>> |<<string,string>>, one of `["dsl", "esql"]`|No
+| <<plugins-{type}s-{plugin}-query_params>> |<<hash,hash>> or <<array,array>>|No
 | <<plugins-{type}s-{plugin}-query_template>> |<<string,string>>|No
 | <<plugins-{type}s-{plugin}-result_size>> |<<number,number>>|No
 | <<plugins-{type}s-{plugin}-retry_on_failure>> |<<number,number>>|No
@@ -337,11 +443,30 @@ environment variables e.g. `proxy => '${LS_PROXY:}'`.
 * Value type is <<string,string>>
 * There is no default value for this setting.
 
-
-
-string
-
+The query to be executed.
+The accepted query shape is a DSL query string or {esql}.
+For a DSL query string, use either `query` or `query_template`.
+Read the {ref}/query-dsl-query-string-query.html[{es} query
+string documentation] or {ref}/esql.html[{es} ES|QL documentation] for more information.
+
+[id="plugins-{type}s-{plugin}-query_type"]
+===== `query_type`
+
+* Value can be `dsl` or `esql`
+* Default value is `dsl`
+
+Defines the <<plugins-{type}s-{plugin}-query>> shape.
+When `dsl`, the query shape must be a valid {es} JSON-style string.
+When `esql`, the query shape must be a valid {esql} string, and the `index`, `query_template`, and `sort` parameters are not allowed.
+
+[id="plugins-{type}s-{plugin}-query_params"]
+===== `query_params`
+
+* The value type is <<hash,hash>> or <<array,array>>. When an array is provided, its elements are pairs of `key` and `value`.
+* There is no default value for this setting
+
+Named parameters in {esql} to send to {es} together with <<plugins-{type}s-{plugin}-query>>.
+Visit the {ref}/esql-rest.html#esql-rest-params[passing parameters to a query page] for more information.
 
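A `query_params` value can be either a static value or a {ls} field reference that is resolved per event. The distinction can be sketched in plain Ruby (illustrative; `partition_params` is a hypothetical helper, not a plugin option):

```ruby
def partition_params(query_params)
  # a value shaped like "[field][path]" is treated as an event field reference
  # to be resolved at lookup time; anything else is passed through as a static value
  query_params.partition { |_k, v| v.is_a?(String) && v.match?(/^\[.*\]$/) }
end

referenced, static = partition_params('food_id' => '[food][id]', 'limit' => 10)
# referenced => [["food_id", "[food][id]"]]
# static     => [["limit", 10]]
```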
|
 [id="plugins-{type}s-{plugin}-query_template"]
 ===== `query_template`
@@ -538,8 +663,9 @@ Tags the event on failure to look up previous log event information. This can be
 
 Define the target field for placing the result data.
 If this setting is omitted, the target will be the root (top level) of the event.
+It is highly recommended to set `target` when using `query_type => 'esql'` so that all query results are placed into the event.
 
-
+When `query_type => 'dsl'`, the destination fields specified in <<plugins-{type}s-{plugin}-fields>>, <<plugins-{type}s-{plugin}-aggregation_fields>>, and <<plugins-{type}s-{plugin}-docinfo_fields>> are relative to this target.
 
 For example, if you want the data to be put in the `operation` field:
 [source,ruby]
data/lib/logstash/filters/elasticsearch/client.rb
CHANGED
@@ -58,11 +58,19 @@ module LogStash
 def search(params={})
   @client.search(params)
 end
+
+def esql_query(params={})
+  @client.esql.query(params)
+end
 
 def info
   @client.info
 end
 
+def es_version
+  info&.dig('version', 'number')
+end
+
 def build_flavor
   @build_flavor ||= info&.dig('version', 'build_flavor')
 end
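The `es_version` helper added above relies on safe navigation plus `Hash#dig`, so a missing cluster response degrades to `nil` instead of raising. A standalone illustration (free function form for demonstration only):

```ruby
def es_version(info)
  # safe navigation: a nil `info` (e.g. cluster unreachable) yields nil
  info&.dig('version', 'number')
end

es_version({ 'version' => { 'number' => '8.17.4', 'build_flavor' => 'default' } })  # => "8.17.4"
es_version(nil)                                                                     # => nil
```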
data/lib/logstash/filters/elasticsearch/dsl_executor.rb
ADDED
@@ -0,0 +1,140 @@
+# encoding: utf-8
+
+module LogStash
+  module Filters
+    class Elasticsearch
+      class DslExecutor
+        def initialize(plugin, logger)
+          @index = plugin.params["index"]
+          @query = plugin.params["query"]
+          @query_dsl = plugin.query_dsl
+          @fields = plugin.params["fields"]
+          @result_size = plugin.params["result_size"]
+          @docinfo_fields = plugin.params["docinfo_fields"]
+          @tag_on_failure = plugin.params["tag_on_failure"]
+          @enable_sort = plugin.params["enable_sort"]
+          @sort = plugin.params["sort"]
+          @aggregation_fields = plugin.params["aggregation_fields"]
+          @logger = logger
+          @event_decorator = plugin.method(:decorate)
+          @target_field = plugin.params["target"]
+          if @target_field
+            def self.apply_target(path); "[#{@target_field}][#{path}]"; end
+          else
+            def self.apply_target(path); path; end
+          end
+        end
+
+        def process(client, event)
+          matched = false
+          begin
+            params = { :index => event.sprintf(@index) }
+
+            if @query_dsl
+              query = LogStash::Json.load(event.sprintf(@query_dsl))
+              params[:body] = query
+            else
+              query = event.sprintf(@query)
+              params[:q] = query
+              params[:size] = @result_size
+              params[:sort] = @sort if @enable_sort
+            end
+
+            @logger.debug("Querying elasticsearch for lookup", :params => params)
+
+            results = client.search(params)
+            raise "Elasticsearch query error: #{results["_shards"]["failures"]}" if results["_shards"].include? "failures"
+
+            event.set("[@metadata][total_hits]", extract_total_from_hits(results['hits']))
+
+            result_hits = results["hits"]["hits"]
+            if !result_hits.nil? && !result_hits.empty?
+              matched = true
+              @fields.each do |old_key, new_key|
+                old_key_path = extract_path(old_key)
+                extracted_hit_values = result_hits.map do |doc|
+                  extract_value(doc["_source"], old_key_path)
+                end
+                value_to_set = extracted_hit_values.count > 1 ? extracted_hit_values : extracted_hit_values.first
+                set_to_event_target(event, new_key, value_to_set)
+              end
+              @docinfo_fields.each do |old_key, new_key|
+                old_key_path = extract_path(old_key)
+                extracted_docs_info = result_hits.map do |doc|
+                  extract_value(doc, old_key_path)
+                end
+                value_to_set = extracted_docs_info.count > 1 ? extracted_docs_info : extracted_docs_info.first
+                set_to_event_target(event, new_key, value_to_set)
+              end
+            end
+
+            result_aggregations = results["aggregations"]
+            if !result_aggregations.nil? && !result_aggregations.empty?
+              matched = true
+              @aggregation_fields.each do |agg_name, ls_field|
+                set_to_event_target(event, ls_field, result_aggregations[agg_name])
+              end
+            end
+
+          rescue => e
+            if @logger.trace?
+              @logger.warn("Failed to query elasticsearch for previous event", :index => @index, :query => @query, :event => event.to_hash, :error => e.message, :backtrace => e.backtrace)
+            elsif @logger.debug?
+              @logger.warn("Failed to query elasticsearch for previous event", :index => @index, :error => e.message, :backtrace => e.backtrace)
+            else
+              @logger.warn("Failed to query elasticsearch for previous event", :index => @index, :error => e.message)
+            end
+            @tag_on_failure.each { |tag| event.tag(tag) }
+          else
+            @event_decorator.call(event) if matched
+          end
+        end
+
+        private
+
+        # Given a "hits" object from an Elasticsearch response, return the total number of hits in
+        # the result set.
+        # @param hits [Hash{String=>Object}]
+        # @return [Integer]
+        def extract_total_from_hits(hits)
+          total = hits['total']
+
+          # Elasticsearch 7.x produces an object containing `value` and `relation` in order
+          # to enable unambiguous reporting when the total is only a lower bound; if we get
+          # an object back, return its `value`.
+          return total['value'] if total.kind_of?(Hash)
+          total
+        end
+
+        # get an array of path elements from a path reference
+        def extract_path(path_reference)
+          return [path_reference] unless path_reference.start_with?('[') && path_reference.end_with?(']')
+
+          path_reference[1...-1].split('][')
+        end
+
+        # given a Hash and an array of path fragments, returns the value at the path
+        # @param source [Hash{String=>Object}]
+        # @param path [Array{String}]
+        # @return [Object]
+        def extract_value(source, path)
+          path.reduce(source) do |memo, old_key_fragment|
+            break unless memo.include?(old_key_fragment)
+            memo[old_key_fragment]
+          end
+        end
+
+        # if @target is defined, creates a nested structure to inject result into target field
+        # if not defined, directly sets to the top-level event field
+        # @param event [LogStash::Event]
+        # @param new_key [String] name of the field to set
+        # @param value_to_set [Array] values to set
+        # @return [void]
+        def set_to_event_target(event, new_key, value_to_set)
+          key_to_set = self.apply_target(new_key)
+          event.set(key_to_set, value_to_set)
+        end
+      end
+    end
+  end
+end
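The `extract_path`/`extract_value` pair in `DslExecutor` above converts a {ls}-style field reference like `[a][b]` into path fragments and walks them through the `_source` hash. A self-contained sketch of the same logic (the `source` hash here is sample data, not from the plugin):

```ruby
def extract_path(path_reference)
  # "[a][b]" → ["a", "b"]; a bare name stays a one-element path
  return [path_reference] unless path_reference.start_with?('[') && path_reference.end_with?(']')
  path_reference[1...-1].split('][')
end

def extract_value(source, path)
  # walk the hash; `break` aborts the reduce and returns nil on a missing key
  path.reduce(source) do |memo, fragment|
    break unless memo.include?(fragment)
    memo[fragment]
  end
end

source = { 'geoip' => { 'city_name' => 'Amsterdam' }, 'status' => 200 }
extract_value(source, extract_path('[geoip][city_name]'))  # => "Amsterdam"
extract_value(source, extract_path('[geoip][missing]'))    # => nil
```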
data/lib/logstash/filters/elasticsearch/esql_executor.rb
ADDED
@@ -0,0 +1,178 @@
+# encoding: utf-8
+
+module LogStash
+  module Filters
+    class Elasticsearch
+      class EsqlExecutor
+
+        ESQL_PARSERS_BY_TYPE = Hash.new(lambda { |x| x }).merge(
+          'date' => ->(value) { value && LogStash::Timestamp.new(value) },
+        )
+
+        def initialize(plugin, logger)
+          @logger = logger
+
+          @event_decorator = plugin.method(:decorate)
+          @query = plugin.params["query"]
+
+          query_params = plugin.query_params || {}
+          reference_valued_params, static_valued_params = query_params.partition { |_, v| v.kind_of?(String) && v.match?(/^\[.*\]$/) }
+          @referenced_params = reference_valued_params&.to_h
+          # keep static params as an array of hashes to attach to the ES|QL api param easily
+          @static_params = static_valued_params.map { |k, v| { k => v } }
+          @tag_on_failure = plugin.params["tag_on_failure"]
+          @logger.debug("ES|QL query executor initialized with ", query: @query, query_params: query_params)
+
+          # if the target is specified, all result entries will be copied to the target field
+          # otherwise, the first value of the result will be copied to the event
+          @target_field = plugin.params["target"]
+          @logger.warn("Only first query result will be copied to the event. Please specify `target` in plugin config to include all") if @target_field.nil?
+        end
+
+        def process(client, event)
+          resolved_params = @referenced_params&.any? ? resolve_parameters(event) : []
+          resolved_params.concat(@static_params) if @static_params&.any?
+          response = execute_query(client, resolved_params)
+          inform_warning(response)
+          process_response(event, response)
+          @event_decorator.call(event)
+        rescue => e
+          @logger.error("Failed to process ES|QL filter", exception: e)
+          @tag_on_failure.each { |tag| event.tag(tag) }
+        end
+
+        private
+
+        def resolve_parameters(event)
+          @referenced_params.map do |key, value|
+            begin
+              resolved_value = event.get(value)
+              @logger.debug("Resolved value for #{key}: #{resolved_value}, its class: #{resolved_value.class}")
+              { key => resolved_value }
+            rescue => e
+              # catches invalid field reference
+              raise "Failed to resolve parameter `#{key}` with `#{value}`. Error: #{e.message}"
+            end
+          end
+        end
+
+        def execute_query(client, params)
+          # debug logs may help to check what query shape the plugin is sending to ES
+          @logger.debug("Executing ES|QL query", query: @query, params: params)
+          client.esql_query({ body: { query: @query, params: params }, format: 'json', drop_null_columns: true })
+        end
+
+        def process_response(event, response)
+          columns = response['columns']&.freeze || []
+          values = response['values']&.freeze || []
+          if values.nil? || values.size == 0
+            @logger.debug("Empty ES|QL query result", columns: columns, values: values)
+            return
+          end
+
+          # this shouldn't happen but just in case to avoid crashes the plugin
+          if columns.nil? || columns.size == 0
+            @logger.error("No columns exist but received values", columns: columns, values: values)
+            return
+          end
+
+          event.set("[@metadata][total_values]", values.size)
+          @logger.debug("ES|QL query result values size ", size: values.size)
+
+          column_specs = columns.map { |column| ColumnSpec.new(column) }
+          sub_element_mark_map = mark_sub_elements(column_specs)
+          multi_fields = sub_element_mark_map.filter_map { |key, val| key.name if val == true }
+
+          @logger.debug("Multi-fields found in ES|QL result and they will not be available in the event. Please use `RENAME` command if you want to include them.", { :detected_multi_fields => multi_fields }) if multi_fields.any?
+
+          if @target_field
+            values_to_set = values.map do |row|
+              mapped_data = column_specs.each_with_index.with_object({}) do |(column, index), mapped_data|
+                # `unless value.nil?` is a part of `drop_null_columns` that if some of the columns' values are not `nil`, `nil` values appear,
+                # we should continuously filter them out to achieve full `drop_null_columns` on each individual row (ideal `LIMIT 1` result)
+                # we also exclude sub-elements of the base field
+                if row[index] && sub_element_mark_map[column] == false
+                  value_to_set = ESQL_PARSERS_BY_TYPE[column.type].call(row[index])
+                  mapped_data[column.name] = value_to_set
+                end
+              end
+              generate_nested_structure(mapped_data) unless mapped_data.empty?
+            end
+            event.set("[#{@target_field}]", values_to_set)
+          else
+            column_specs.zip(values.first).each do |(column, value)|
+              if value && sub_element_mark_map[column] == false
+                value_to_set = ESQL_PARSERS_BY_TYPE[column.type].call(value)
+                event.set(column.field_reference, value_to_set)
+              end
+            end
+          end
+        end
+
+        def inform_warning(response)
+          return unless (warning = response&.headers&.dig('warning'))
+          @logger.warn("ES|QL executor received warning", { message: warning })
+        end
+
+        # Transforms dotted keys to nested JSON shape
+        # @param dot_keyed_hash [Hash] whose keys are dotted (example 'a.b.c.d': 'val')
+        # @return [Hash] whose keys are nested with value mapped ({'a':{'b':{'c':{'d':'val'}}}})
+        def generate_nested_structure(dot_keyed_hash)
+          dot_keyed_hash.each_with_object({}) do |(key, value), result|
+            key_parts = key.to_s.split('.')
+            *path, leaf = key_parts
+            leaf_scope = path.inject(result) { |scope, part| scope[part] ||= {} }
+            leaf_scope[leaf] = value
+          end
+        end
+
+        # Determines whether each column in a collection is a nested sub-element (e.g "user.age")
+        # of another column in the same collection (e.g "user").
+        #
+        # @param columns [Array<ColumnSpec>] An array of objects with a `name` attribute representing field paths.
+        # @return [Hash<ColumnSpec, Boolean>] A hash mapping each column to `true` if it is a sub-element of another field, `false` otherwise.
+        # Time complexity: (O(NlogN+N*K)) where K is the number of conflict depth
+        # without (`prefix_set`) memoization, it would be O(N^2)
+        def mark_sub_elements(columns)
+          # Sort columns by name length (ascending)
+          sorted_columns = columns.sort_by { |c| c.name.length }
+          prefix_set = Set.new # memoization set
+
+          sorted_columns.each_with_object({}) do |column, memo|
+            # Split the column name into parts (e.g., "user.profile.age" → ["user", "profile", "age"])
+            parts = column.name.split('.')
+
+            # Generate all possible parent prefixes (e.g., "user", "user.profile")
+            # and check if any parent prefix exists in the set
+            parent_prefixes = (0...parts.size - 1).map { |i| parts[0..i].join('.') }
+            memo[column] = parent_prefixes.any? { |prefix| prefix_set.include?(prefix) }
+            prefix_set.add(column.name)
+          end
+        end
+      end
+
+      # Class representing a column specification in the ESQL response['columns']
+      # The class's main purpose is to provide a structure for the event key
+      # columns is an array with `name` and `type` pair (example: `{"name"=>"@timestamp", "type"=>"date"}`)
+      # @attr_reader :name [String] The name of the column
+      # @attr_reader :type [String] The type of the column
+      class ColumnSpec
+        attr_reader :name, :type
+
+        def initialize(spec)
+          @name = isolate(spec.fetch('name'))
+          @type = isolate(spec.fetch('type'))
+        end
+
+        def field_reference
+          @_field_reference ||= '[' + name.gsub('.', '][') + ']'
+        end
+
+        private
+        def isolate(value)
+          value.frozen? ? value : value.clone.freeze
+        end
+      end
+    end
+  end
+end
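The `generate_nested_structure` helper in `EsqlExecutor` above is the piece that turns dotted result keys into the nested event shape. It has no Logstash dependencies, so the same logic runs standalone:

```ruby
def generate_nested_structure(dot_keyed_hash)
  dot_keyed_hash.each_with_object({}) do |(key, value), result|
    *path, leaf = key.to_s.split('.')
    # create one nesting level per dot-separated part, then set the leaf value
    leaf_scope = path.inject(result) { |scope, part| scope[part] ||= {} }
    leaf_scope[leaf] = value
  end
end

generate_nested_structure('status.code' => 403, 'status.desc' => 'Forbidden', 'action' => 'purchase')
# => {"status"=>{"code"=>403, "desc"=>"Forbidden"}, "action"=>"purchase"}
```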