logstash-filter-augment 0.1.0 → 0.2.0
- checksums.yaml +4 -4
- data/README.md +88 -10
- data/lib/logstash/filters/augment.rb +9 -2
- data/logstash-filter-augment.gemspec +1 -1
- data/spec/filters/augment_spec.rb +19 -0
- data/spec/fixtures/test-with-tabs.txt +2 -0
- metadata +4 -2
checksums.yaml CHANGED

```diff
@@ -1,7 +1,7 @@
 ---
 SHA1:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 2a72a388e9e7ae66e9f689bd614d6917d2796c78
+  data.tar.gz: bcdfce52b299a4187fe846e7d8cd2e385b0ead28
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 355a1d212bccd9c4af829d45c6e1e73c0395ccf33de9dbb4a20f7f5697d9e2aaf4b816dc1d0580e82cc4637710deddb7cb7acb8db7413a948c80fb8f27e0a017
+  data.tar.gz: d8c07568b6b23f336065d0a1860f4d6fc83b59bb1c0ce4c17749cc82a22eb601d88301dad8279eeaea8c06f8ca7ad26fcdf4f4fd30d0a51880cd3195b8648241
```
data/README.md CHANGED

````diff
@@ -8,11 +8,46 @@ It can be used to augment events in logstash from config, CSV file, JSON file, o
 
 ## Documentation
 
-
+logstash-filter-augment is a logstash plugin for augmenting events with data from a config file or external file (in CSV, JSON, or YAML format). The filter takes a `field` parameter that specifies what is being looked up. Based on configuration, it will find the object that is referred to and add the fields of that object to your event.
+
+In the case of a CSV file, you'll want to specify the `csv_key` to tell it which field of the file is the key (it defaults to the first column of the CSV if you don't specify one). If your CSV file doesn't contain a header row, you'll need to set `csv_header` to an array of the column names. If you do have a header, you can still specify `csv_header`, but be sure to also specify `csv_first_line => ignore`.
+
+In the case of JSON, you can provide a simple dictionary that maps the keys to the objects:
+```json
+{
+  "200": { "color": "green", "message": "ok" }
+}
+```
+or in array format:
+```json
+[
+  {"code": 200, "color": "green", "message": "ok"}
+]
+```
+but then you'll have to provide a `json_key => "code"` parameter in your config file to let it know which field you want to use for lookups.
+
+YAML works the same as JSON -- you can specify either a dictionary or an array:
+```yaml
+200:
+  color: green
+  message: ok
+404:
+  color: red
+  message: not found
+```
+or
+```yaml
+- code: 200
+  color: green
+  message: ok
+- code: 404
+  color: red
+  message: not found
+```
+but again, you'll need to specify `yaml_key => "code"`.
+
+Finally, you can configure logstash-filter-augment statically with a dictionary:
 ```ruby
-filter {
-  augment {
-    field => "status"
   dictionary => {
     "200" => {
       "color" => "green"
@@ -23,17 +58,60 @@ filter {
       "message" => "Missing"
     }
   }
-
+  default => {
     "color" => "orange"
     "message" => "not found"
   }
 }
-}
 ```
-
-
-
-
+If you choose this route, be careful that you quote your keys or you could end up with weird logstash errors.
+### config parameters
+| parameter | required (default) | Description |
+| --------- |:---:| --- |
+| field | Yes | the field of the event to look up in the dictionary |
+| dictionary_path | Yes, if `dictionary` isn't provided | The list of files to load |
+| dictionary_type | No (auto) | The type of files provided on dictionary_path. Allowed values are `auto`, `csv`, `json`, `yaml`, and `yml` |
+| dictionary | Yes, if `dictionary_path` isn't provided | A dictionary to use. See example above |
+| csv_header | No | The header fields of the CSV file |
+| csv_first_line | No (auto) | Indicates what to do with the first line of the file. Valid values are `ignore`, `header`, `data`, and `auto`. `auto` treats the first line as `data` if csv_header is set, or as `header` if it isn't |
+| csv_key | No | On CSV files, which field name is the key. Defaults to the first column of the file if not set |
+| csv_remove_key | No (true) | Remove the key from the object. You might want to set this to false if you don't have a `default` set, so that you know which records were matched |
+| csv_col_sep | No (,) | Change the column separator for CSV files. If you need to use tabs, you have to embed a real tab in the quotes |
+| csv_quote_char | No (") | Change the quote character for CSV files |
+| json_key | Yes, if array | The field of the JSON objects to use as a key for the dictionary |
+| json_remove_key | No | Similar to csv_remove_key |
+| yaml_key | Yes, if array | The field of the YAML objects to use as a key for the dictionary |
+| yaml_remove_key | No | Similar to csv_remove_key |
+| augment_fields | No (all fields) | The fields to copy from the object to the target. If this is specified, only these fields will be copied |
+| ignore_fields | No | If this list is specified and `augment_fields` isn't, then these fields will not be copied |
+| default | No | A dictionary of fields to add to the target if the key isn't in the data |
+| target | No ("") | Where to target the fields. If this is left as the default "", it targets the event itself. Otherwise you can specify a valid event selector; for example, [user][location] would set user.location.{fields from object} |
+| refresh_interval | No (60) | The number of seconds between checks to see if the file has been modified. Set to -1 to disable checking, or to 0 to check on every event (not recommended) |
+
+## Use Cases
+### Geocoding by key
+If you have a field that can be used to look up a location and you have a location file, you could configure it this way:
+```ruby
+augment {
+  field => "store"
+  target => "[location]"
+  dictionary_path => "geocode.csv"
+  csv_header => ["id","lat","lon"]
+  csv_key => "id"
+  csv_first_line => "data"
+}
+```
+and then be sure that your mapping / mapping template changes "location" into a geo_point.
+### Attach multiple pieces of user data based on user key
+```ruby
+augment {
+  field => "username"
+  dictionary_path => ["users1.csv", "users2.csv"]
+  csv_header => ["username","fullName","address1","address2","city","state","zipcode"]
+  csv_key => "username"
+  csv_first_line => "ignore"
+}
+```
 ## Developing
 
 ### 1. Plugin Development and Testing
````
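One detail of the new README that is easy to get wrong is the `csv_col_sep` note about embedding a real tab. As a hedged illustration only (the file name `stores.tsv` and its columns are hypothetical, not taken from this diff), a tab-separated lookup would look like:

```ruby
filter {
  augment {
    field => "store"
    target => "[location]"
    dictionary_path => "stores.tsv"   # hypothetical file with id<TAB>lat<TAB>lon rows
    csv_header => ["id","lat","lon"]
    csv_key => "id"
    csv_first_line => "data"
    csv_col_sep => "	"               # a literal tab character between the quotes
  }
}
```

As the README says, the tab has to be embedded as a real character inside the quotes rather than written as an escape sequence.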
data/lib/logstash/filters/augment.rb CHANGED

```diff
@@ -59,7 +59,7 @@ class LogStash::Filters::Augment < LogStash::Filters::Base
   # - 'ignore' skips it (csv_header must be set)
   # - 'header' reads it and populates csv_header with it (csv_header must not be set)
   # - 'data' reads it as data (csv_header must be set)
-  # - 'auto' treats the first line as data if csv_header is set or header if
+  # - 'auto' treats the first line as `data` if csv_header is set or `header` if it isn't
   config :csv_first_line, :validate => ["data","header","ignore","auto"], :default=>"auto"
   # the csv_key determines which field of the csv file is the dictionary key
   # if this is not set, it will default to first column of the csv file
@@ -70,6 +70,10 @@ class LogStash::Filters::Augment < LogStash::Filters::Base
   # is false then the event will have a status=200. If csv_remove_key is true, then the event won't have
   # a status unless it already existed in the event.
   config :csv_remove_key, :validate => :boolean, :default => true
+  # the column separator for a CSV file
+  config :csv_col_sep, :validate => :string, :default => ","
+  # the quote character for a CSV file
+  config :csv_quote_char, :validate => :string, :default => '"'
   # if the json file provided is an array, this specifies which field of the
   # array of objects is the key value
   config :json_key, :validate => :string
@@ -265,7 +269,7 @@ private
       raise LogStash::ConfigurationError, "The csv_first_line is set to 'ignore' but csv_header is not set"
     end
   end
-  csv_lines = CSV.read(filename);
+  csv_lines = CSV.read(filename, { :col_sep => @csv_col_sep, :quote_char => @csv_quote_char });
   if @csv_first_line == 'header'
     @csv_header = csv_lines.shift
   elsif @csv_first_line == 'ignore'
@@ -316,6 +320,9 @@ private
   if ! @dictionaries
     return
   end
+  if @refresh_interval < 0 && @dictionary_mtime # don't refresh if we aren't supposed to
+    return
+  end
   if (@next_refresh && @next_refresh + @refresh_interval < Time.now)
     return
   end
```
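For background on the `CSV.read` change above: `:col_sep` and `:quote_char` are standard options of Ruby's stdlib CSV library, so the new behavior can be sanity-checked outside Logstash. A minimal sketch (the file name is hypothetical):

```ruby
require 'csv'

# Parse a tab-separated file the same way the patched filter does.
rows = CSV.read("lookup.tsv", :col_sep => "\t", :quote_char => '"')

# Each element is an array of column values, e.g. ["200", "green", "ok"].
rows.each { |row| puts row.inspect }
```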
data/logstash-filter-augment.gemspec CHANGED

```diff
@@ -1,6 +1,6 @@
 Gem::Specification.new do |s|
   s.name = 'logstash-filter-augment'
-  s.version = '0.1.0'
+  s.version = '0.2.0'
   s.licenses = ['Apache License (2.0)']
   s.summary = 'A logstash plugin to augment your events from data in files'
   s.description = 'A logstash plugin that can merge data from CSV, YAML, and JSON files with events.'
```
data/spec/filters/augment_spec.rb CHANGED

```diff
@@ -69,6 +69,25 @@ describe LogStash::Filters::Augment do
     expect { subject }.to raise_exception LogStash::ConfigurationError
   end
 end
+describe "csv file with options set" do
+  filename = File.join(File.dirname(__FILE__), "..", "fixtures", "test-with-tabs.txt")
+  config <<-CONFIG
+    filter {
+      augment {
+        field => "status"
+        dictionary_path => '#{filename}'
+        dictionary_type => "csv"
+        csv_first_line => "data"
+        csv_header => ["status","color","message"]
+        csv_col_sep => "	"
+      }
+    }
+  CONFIG
+  sample("status" => "200") do
+    insist { subject.get("color") } == "green"
+    insist { subject.get("message") } == "ok"
+  end
+end
 describe "simple csv file with header ignored" do
   filename = File.join(File.dirname(__FILE__), "..", "fixtures", "test-with-headers.csv")
   config <<-CONFIG
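The new fixture `spec/fixtures/test-with-tabs.txt` (+2 lines) is not rendered in this diff. From the spec's expectations (`status` 200 maps to color `green` and message `ok`) and its two-line count, it plausibly contains tab-separated rows like the following (inferred, not taken from the diff):

```
200	green	ok
404	red	not found
```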
metadata CHANGED

```diff
@@ -1,14 +1,14 @@
 --- !ruby/object:Gem::Specification
 name: logstash-filter-augment
 version: !ruby/object:Gem::Version
-  version: 0.1.0
+  version: 0.2.0
 platform: ruby
 authors:
 - Adam Caldwell
 autorequire:
 bindir: bin
 cert_chain: []
-date: 2017-
+date: 2017-02-13 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   requirement: !ruby/object:Gem::Requirement
@@ -56,6 +56,7 @@ files:
 - spec/fixtures/json-array.json
 - spec/fixtures/json-hash.json
 - spec/fixtures/test-with-headers.csv
+- spec/fixtures/test-with-tabs.txt
 - spec/fixtures/test-without-headers.csv
 - spec/fixtures/yaml-array.yaml
 - spec/fixtures/yaml-object.yaml
@@ -92,6 +93,7 @@ test_files:
 - spec/fixtures/json-array.json
 - spec/fixtures/json-hash.json
 - spec/fixtures/test-with-headers.csv
+- spec/fixtures/test-with-tabs.txt
 - spec/fixtures/test-without-headers.csv
 - spec/fixtures/yaml-array.yaml
 - spec/fixtures/yaml-object.yaml
```