logstash-filter-augment 0.1.0 → 0.2.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- checksums.yaml +4 -4
- data/README.md +88 -10
- data/lib/logstash/filters/augment.rb +9 -2
- data/logstash-filter-augment.gemspec +1 -1
- data/spec/filters/augment_spec.rb +19 -0
- data/spec/fixtures/test-with-tabs.txt +2 -0
- metadata +4 -2
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA1:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 2a72a388e9e7ae66e9f689bd614d6917d2796c78
+  data.tar.gz: bcdfce52b299a4187fe846e7d8cd2e385b0ead28
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 355a1d212bccd9c4af829d45c6e1e73c0395ccf33de9dbb4a20f7f5697d9e2aaf4b816dc1d0580e82cc4637710deddb7cb7acb8db7413a948c80fb8f27e0a017
+  data.tar.gz: d8c07568b6b23f336065d0a1860f4d6fc83b59bb1c0ce4c17749cc82a22eb601d88301dad8279eeaea8c06f8ca7ad26fcdf4f4fd30d0a51880cd3195b8648241
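The checksums above are computed over the two archives inside the published .gem rather than over the .gem file itself. A minimal Ruby sketch to reproduce them locally (the filename assumes a prior `gem fetch logstash-filter-augment -v 0.2.0`):

```ruby
# A .gem is itself a tar archive whose members include metadata.gz and
# data.tar.gz -- the two entries checksums.yaml records digests for.
require 'digest'
require 'rubygems/package'

File.open('logstash-filter-augment-0.2.0.gem', 'rb') do |io|
  Gem::Package::TarReader.new(io).each do |entry|
    next unless %w[metadata.gz data.tar.gz].include?(entry.full_name)
    body = entry.read
    puts "#{entry.full_name} SHA1:   #{Digest::SHA1.hexdigest(body)}"
    puts "#{entry.full_name} SHA512: #{Digest::SHA512.hexdigest(body)}"
  end
end
```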
data/README.md
CHANGED
@@ -8,11 +8,46 @@ It can be used to augment events in logstash from config, CSV file, JSON file, o
 
 ## Documentation
 
-
+logstash-filter-augment is a logstash plugin for augmenting events with data from a config file or external file (in CSV, JSON, or YAML format). The filter takes a `field` parameter that specifies what is being looked up. Based on configuration, it will find the object that is referred to and add the fields of that object to your event.
+
+In the case of a CSV file, you'll want to specify the `csv_key` to tell it which field of the file is the key (it defaults to the first column of the CSV if you don't specify one). If your CSV file doesn't contain a header row, you'll need to set `csv_header` to an array of the column names. If you do have a header, you can still specify `csv_header`, but be sure to also set `csv_first_line => ignore`.
+
+In the case of JSON, you can provide a simple dictionary that maps the keys to the objects:
+```json
+{
+  "200": { "color": "green", "message": "ok" }
+}
+```
+or in array format:
+```json
+[
+  {"code": 200, "color": "green", "message": "ok"}
+]
+```
+but then you'll have to provide a `json_key => "code"` parameter in your config file to let it know which field you want to use for lookups.
+
+YAML works the same as JSON -- you can specify either a dictionary or an array:
+```yaml
+200:
+  color: green
+  message: ok
+404:
+  color: red
+  message: not found
+```
+or
+```yaml
+- code: 200
+  color: green
+  message: ok
+- code: 404
+  color: red
+  message: not found
+```
+but again, you'll need to specify `yaml_key => "code"`.
+
+Finally, you can configure logstash-filter-augment statically with a dictionary:
 ```ruby
-filter {
-  augment {
-    field => "status"
     dictionary => {
       "200" => {
         "color" => "green"
@@ -23,17 +58,60 @@ filter {
       "message" => "Missing"
     }
   }
-
+  default => {
     "color" => "orange"
     "message" => "not found"
   }
 }
-}
 ```
-
-
-
-
+If you choose this route, be careful to quote your keys or you could end up with weird logstash errors.
+### config parameters
+| parameter | required (default) | Description |
+| --------- | :---: | --- |
+| field | Yes | the field of the event to look up in the dictionary |
+| dictionary_path | Yes, if `dictionary` isn't provided | The list of files to load |
+| dictionary_type | No (auto) | The type of the files provided on dictionary_path. Allowed values are `auto`, `csv`, `json`, `yaml`, and `yml` |
+| dictionary | Yes, if `dictionary_path` isn't provided | A dictionary to use. See the example above |
+| csv_header | No | The header fields of the CSV file |
+| csv_first_line | No (auto) | What to do with the first line of the file. Valid values are `ignore`, `header`, `data`, and `auto`; `auto` treats the first line as `data` if csv_header is set, or as `header` if it isn't |
+| csv_key | No | Which field of the CSV file is the key. Defaults to the first column of the file if not set |
+| csv_remove_key | No (true) | Remove the key from the object. You might want to set this to false if you don't have a `default` set, so that you know which records were matched |
+| csv_col_sep | No (,) | The column separator for CSV files. If you need to use tabs, you have to embed a real tab in the quotes |
+| csv_quote_char | No (") | The quote character for CSV files |
+| json_key | Yes, if array | The field of the JSON objects to use as a key for the dictionary |
+| json_remove_key | No | Similar to csv_remove_key |
+| yaml_key | Yes, if array | The field of the YAML objects to use as a key for the dictionary |
+| yaml_remove_key | No | Similar to csv_remove_key |
+| augment_fields | No (all fields) | The fields to copy from the object to the target. If this is specified, only these fields will be copied |
+| ignore_fields | No | If this list is specified and `augment_fields` isn't, these fields will not be copied |
+| default | No | A dictionary of fields to add to the target if the key isn't in the data |
+| target | No ("") | Where to write the fields. If this is left as the default "", it targets the event itself; otherwise you can specify a valid event selector, e.g. [user][location] would set user.location.{fields from object} |
+| refresh_interval | No (60) | The number of seconds between checks to see whether the file has been modified. Set to -1 to disable checking, or to 0 to check on every event (not recommended) |
+
+## Use Cases
+### Geocoding by key
+If you have a field that can be used to look up a location and you have a location file, you could configure it this way:
+```ruby
+augment {
+  field => "store"
+  target => "[location]"
+  dictionary_path => "geocode.csv"
+  csv_header => ["id","lat","lon"]
+  csv_key => "id"
+  csv_first_line => "data"
+}
+```
+and then be sure that your mapping / mapping template changes "location" into a geo_point.
+### Attach multiple pieces of user data based on user key
+```ruby
+augment {
+  field => "username"
+  dictionary_path => ["users1.csv", "users2.csv"]
+  csv_header => ["username","fullName","address1","address2","city","state","zipcode"]
+  csv_key => "username"
+  csv_first_line => "ignore"
+}
+```
 ## Developing
 
 ### 1. Plugin Development and Testing
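The options documented in the new README compose into a complete pipeline. A hedged sketch putting several of them together (the `status.yaml` path and the `[http]` target are invented for illustration, not shipped with the package):

```ruby
# Hypothetical pipeline: look up the HTTP status code in a YAML array
# dictionary (so yaml_key is required) and write the matched fields
# under [http] instead of onto the event root.
input { stdin { codec => "json" } }

filter {
  augment {
    field           => "status"        # event field whose value is the lookup key
    dictionary_path => "status.yaml"   # invented path; a YAML array like the README example
    dictionary_type => "yaml"
    yaml_key        => "code"          # required because the YAML file is an array
    target          => "[http]"        # matched fields become http.color, http.message
    default         => { "color" => "orange" "message" => "unknown" }
  }
}

output { stdout { codec => "rubydebug" } }
```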
data/lib/logstash/filters/augment.rb
CHANGED
@@ -59,7 +59,7 @@ class LogStash::Filters::Augment < LogStash::Filters::Base
   # - 'ignore' skips it (csv_header must be set)
   # - 'header' reads it and populates csv_header with it (csv_header must not be set)
   # - 'data' reads it as data (csv_header must be set)
-  # - 'auto' treats the first line as data if csv_header is set or header if
+  # - 'auto' treats the first line as `data` if csv_header is set or `header` if it isn't
   config :csv_first_line, :validate => ["data","header","ignore","auto"], :default=>"auto"
   # the csv_key determines which field of the csv file is the dictionary key
   # if this is not set, it will default to first column of the csv file
@@ -70,6 +70,10 @@ class LogStash::Filters::Augment < LogStash::Filters::Base
   # is false then the event will have a status=200. If csv_remove_key is true, then the event won't have
   # a status unless it already existed in the event.
   config :csv_remove_key, :validate => :boolean, :default => true
+  # the column separator for a CSV file
+  config :csv_col_sep, :validate => :string, :default => ","
+  # the quote character for a CSV file
+  config :csv_quote_char, :validate => :string, :default => '"'
   # if the json file provided is an array, this specifies which field of the
   # array of objects is the key value
   config :json_key, :validate => :string
@@ -265,7 +269,7 @@ private
       raise LogStash::ConfigurationError, "The csv_first_line is set to 'ignore' but csv_header is not set"
     end
   end
-  csv_lines = CSV.read(filename);
+  csv_lines = CSV.read(filename,{ :col_sep => @csv_col_sep, :quote_char => @csv_quote_char });
   if @csv_first_line == 'header'
     @csv_header = csv_lines.shift
   elsif @csv_first_line == 'ignore'
@@ -316,6 +320,9 @@ private
   if ! @dictionaries
     return
   end
+  if @refresh_interval < 0 && @dictionary_mtime # don't refresh if we aren't supposed to
+    return
+  end
   if (@next_refresh && @next_refresh + @refresh_interval < Time.now)
     return
   end
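The two new settings are handed straight through to Ruby's `CSV.read`, and the new guard means `refresh_interval => -1` skips any re-read once the dictionary is loaded. A minimal filter sketch for a tab-separated file (the `codes.tsv` filename is illustrative; as the README notes, the separator must be a real tab embedded in the quotes):

```ruby
filter {
  augment {
    field            => "status"
    dictionary_path  => "codes.tsv"    # illustrative tab-separated file
    dictionary_type  => "csv"
    csv_header       => ["status","color","message"]
    csv_first_line   => "data"
    csv_col_sep      => "	"            # a literal tab character between the quotes
    csv_quote_char   => "'"
    refresh_interval => -1             # with the new guard, the file is read only once
  }
}
```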
data/logstash-filter-augment.gemspec
CHANGED
@@ -1,6 +1,6 @@
 Gem::Specification.new do |s|
   s.name = 'logstash-filter-augment'
-  s.version = '0.1.0'
+  s.version = '0.2.0'
   s.licenses = ['Apache License (2.0)']
   s.summary = 'A logstash plugin to augment your events from data in files'
   s.description = 'A logstash plugin that can merge data from CSV, YAML, and JSON files with events.'
data/spec/filters/augment_spec.rb
CHANGED
@@ -69,6 +69,25 @@ describe LogStash::Filters::Augment do
     expect { subject }.to raise_exception LogStash::ConfigurationError
   end
 end
+describe "csv file with options set" do
+  filename = File.join(File.dirname(__FILE__), "..", "fixtures", "test-with-tabs.txt")
+  config <<-CONFIG
+    filter {
+      augment {
+        field => "status"
+        dictionary_path => '#{filename}'
+        dictionary_type => "csv"
+        csv_first_line => "data"
+        csv_header => ["status","color","message"]
+        csv_col_sep => "	"
+      }
+    }
+  CONFIG
+  sample("status" => "200") do
+    insist { subject.get("color") } == "green"
+    insist { subject.get("message") } == "ok"
+  end
+end
 describe "simple csv file with header ignored" do
   filename = File.join(File.dirname(__FILE__), "..", "fixtures", "test-with-headers.csv")
   config <<-CONFIG
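The new fixture `spec/fixtures/test-with-tabs.txt` (+2 lines) isn't shown in this diff. From the spec above it must contain tab-separated status/color/message rows; a plausible reconstruction (assumed, not taken from the package — the test only exercises the `200` row):

```
200	green	ok
404	red	not found
```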
metadata
CHANGED
@@ -1,14 +1,14 @@
 --- !ruby/object:Gem::Specification
 name: logstash-filter-augment
 version: !ruby/object:Gem::Version
-  version: 0.1.0
+  version: 0.2.0
 platform: ruby
 authors:
 - Adam Caldwell
 autorequire:
 bindir: bin
 cert_chain: []
-date: 2017-
+date: 2017-02-13 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   requirement: !ruby/object:Gem::Requirement
@@ -56,6 +56,7 @@ files:
 - spec/fixtures/json-array.json
 - spec/fixtures/json-hash.json
 - spec/fixtures/test-with-headers.csv
+- spec/fixtures/test-with-tabs.txt
 - spec/fixtures/test-without-headers.csv
 - spec/fixtures/yaml-array.yaml
 - spec/fixtures/yaml-object.yaml
@@ -92,6 +93,7 @@ test_files:
 - spec/fixtures/json-array.json
 - spec/fixtures/json-hash.json
 - spec/fixtures/test-with-headers.csv
+- spec/fixtures/test-with-tabs.txt
 - spec/fixtures/test-without-headers.csv
 - spec/fixtures/yaml-array.yaml
 - spec/fixtures/yaml-object.yaml