logstash-filter-augment 0.1.0 → 0.2.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA1:
- metadata.gz: c4299e05ab87426f7d354fc3133aa115e064e07b
- data.tar.gz: b78aecd5b87b25e96eab92240808ccdc1221d88b
+ metadata.gz: 2a72a388e9e7ae66e9f689bd614d6917d2796c78
+ data.tar.gz: bcdfce52b299a4187fe846e7d8cd2e385b0ead28
  SHA512:
- metadata.gz: 7dc36059f8478636395e7f12ab12b8a7b1d283c03c35434a4e9e699211de6262d35efab81c155ff443f03190e11a18e601bd5701342dc95a33d4551bae9cb22b
- data.tar.gz: 2f4f7d31525289ac828bc17cdafd38c23e1f82a96456f8047a8fe2cac2d5ef988240d5a7f439ad590de5b9a94fb9fe384c2bd531181bc0dc77cd8d97807a6100
+ metadata.gz: 355a1d212bccd9c4af829d45c6e1e73c0395ccf33de9dbb4a20f7f5697d9e2aaf4b816dc1d0580e82cc4637710deddb7cb7acb8db7413a948c80fb8f27e0a017
+ data.tar.gz: d8c07568b6b23f336065d0a1860f4d6fc83b59bb1c0ce4c17749cc82a22eb601d88301dad8279eeaea8c06f8ca7ad26fcdf4f4fd30d0a51880cd3195b8648241
data/README.md CHANGED
@@ -8,11 +8,46 @@ It can be used to augment events in logstash from config, CSV file, JSON file, o
 
  ## Documentation
 
- The logstash-filter-augment plugin can be configured statically like this:
+ logstash-filter-augment is a logstash plugin for augmenting events with data from a config file or external file (in CSV, JSON, or YAML format). The filter takes a `field` parameter that specifies what is being looked up. Based on configuration, it will find the object that is referred to and add the fields of that object to your event.
+
+ In the case of a CSV file, you'll want to specify the `csv_key` to tell it which field of the file is the key (it defaults to the first column of the CSV if you don't specify). If your CSV file doesn't contain a header row, you'll need to set `csv_header` to an array of the column names. If you do have a header, you can still specify `csv_header`, but be sure to also specify `csv_first_line => ignore`.
+
+ In the case of JSON, you can provide a simple dictionary that maps the keys to the objects:
+ ```json
+ {
+   "200": { "color": "green", "message": "ok" }
+ }
+ ```
+ or in array format:
+ ```json
+ [
+   {"code": 200, "color": "green", "message": "ok"}
+ ]
+ ```
+ but then you'll have to provide a `json_key => "code"` parameter in your config to let it know which field you want to use for lookups.
+
+ YAML works the same as JSON -- you can specify either a dictionary or an array:
+ ```yaml
+ 200:
+   color: green
+   message: ok
+ 404:
+   color: red
+   message: not found
+ ```
+ or
+ ```yaml
+ - code: 200
+   color: green
+   message: ok
+ - code: 404
+   color: red
+   message: not found
+ ```
+ but again, you'll need to specify `yaml_key => "code"`.
+
+ Finally, you can configure logstash-filter-augment statically with a dictionary:
  ```ruby
- filter {
- augment {
- field => "status"
  dictionary => {
  "200" => {
  "color" => "green"
@@ -23,17 +58,60 @@ filter {
  "message" => "Missing"
  }
  }
- augment_default => {
+ default => {
  "color" => "orange"
  "message" => "not found"
  }
  }
- }
  ```
- And then when an event with status=200 in, it will add color=green and message=OK to the event
-
- Additionally you use a CSV, YAML, or JSON file to define the mapping.
-
+ If you choose this route, be careful to quote your keys or you could end up with weird logstash errors.
+ ### config parameters
+ | parameter | required (default) | Description |
+ | --------- |:---:| --- |
+ | field | Yes | the field of the event to look up in the dictionary |
+ | dictionary_path | Yes, if `dictionary` isn't provided | The list of files to load |
+ | dictionary_type | No (auto) | The type of files provided on `dictionary_path`. Allowed values are `auto`, `csv`, `json`, `yaml`, and `yml` |
+ | dictionary | Yes, if `dictionary_path` isn't provided | A dictionary to use. See the example above |
+ | csv_header | No | The header fields of the CSV file |
+ | csv_first_line | No (auto) | indicates what to do with the first line of the file. Valid values are `ignore`, `header`, `data`, and `auto`. `auto` treats the first line as `data` if `csv_header` is set or `header` if it isn't |
+ | csv_key | No | On CSV files, which field name is the key. Defaults to the first column of the file if not set |
+ | csv_remove_key | No (true) | Remove the key from the object. You might want to set this to false if you don't have a `default` set so that you know which records were matched |
+ | csv_col_sep | No (,) | Change the column separator for CSV files. If you need to use tabs, you have to embed a real tab in the quotes |
+ | csv_quote_char | No (") | Change the quote character for CSV files |
+ | json_key | Yes, if array | The field of the JSON objects to use as a key for the dictionary |
+ | json_remove_key | No | Similar to `csv_remove_key` |
+ | yaml_key | Yes, if array | The field of the YAML objects to use as a key for the dictionary |
+ | yaml_remove_key | No | Similar to `csv_remove_key` |
+ | augment_fields | No (all fields) | The fields to copy from the object to the target. If this is specified, only these fields will be copied |
+ | ignore_fields | No | If this list is specified and `augment_fields` isn't, then these fields will not be copied |
+ | default | No | A dictionary of fields to add to the target if the key isn't in the data |
+ | target | No ("") | Where to target the fields. If this is left as the default "", it targets the event itself. Otherwise you can specify a valid event selector; for example, `[user][location]` would set user.location.{fields from object} |
+ | refresh_interval | No (60) | The number of seconds between checks to see if the file has been modified. Set to -1 to disable checking; set to 0 to check on every event (not recommended) |
+
+ ## Use Cases
+ ### Geocoding by key
+ If you have a field that can be used to look up a location and you have a location file, you could configure it this way:
+ ```ruby
+ augment {
+   field => "store"
+   target => "[location]"
+   dictionary_path => "geocode.csv"
+   csv_header => ["id","lat","lon"]
+   csv_key => "id"
+   csv_first_line => "data"
+ }
+ ```
+ and then be sure that your mapping / mapping template changes "location" into a geo_point
+ ### Attach multiple pieces of user data based on a user key
+ ```ruby
+ augment {
+   field => "username"
+   dictionary_path => ["users1.csv", "users2.csv"]
+   csv_header => ["username","fullName","address1","address2","city","state","zipcode"]
+   csv_key => "username"
+   csv_first_line => "ignore"
+ }
+ ```
  ## Developing
 
  ### 1. Plugin Development and Testing
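The option table in the README changes above can be exercised end to end. A hypothetical filter block for a tab-separated lookup file, using only parameters documented in that table (the file name `stores.tsv` and its columns are made up for illustration):

```ruby
filter {
  augment {
    field => "store"                     # event field to look up
    dictionary_path => "stores.tsv"      # hypothetical tab-separated file
    dictionary_type => "csv"
    csv_header => ["store","lat","lon"]  # the file has no header row
    csv_first_line => "data"
    csv_col_sep => "	"                   # a real tab embedded in the quotes
    refresh_interval => 300              # re-check the file every 5 minutes
  }
}
```

This is a sketch of a logstash pipeline config fragment, not runnable Ruby; each matched event would gain the `lat` and `lon` fields from the row whose key matches `store`.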
@@ -59,7 +59,7 @@ class LogStash::Filters::Augment < LogStash::Filters::Base
  # - 'ignore' skips it (csv_header must be set)
  # - 'header' reads it and populates csv_header with it (csv_header must not be set)
  # - 'data' reads it as data (csv_header must be set)
- # - 'auto' treats the first line as data if csv_header is set or header if csv_data isn't set
+ # - 'auto' treats the first line as `data` if csv_header is set or `header` if it isn't
  config :csv_first_line, :validate => ["data","header","ignore","auto"], :default=>"auto"
  # the csv_key determines which field of the csv file is the dictionary key
  # if this is not set, it will default to first column of the csv file
@@ -70,6 +70,10 @@ class LogStash::Filters::Augment < LogStash::Filters::Base
  # is false then the event will have a status=200. If csv_remove_key is true, then the event won't have
  # a status unless it already existed in the event.
  config :csv_remove_key, :validate => :boolean, :default => true
+ # the column separator for a CSV file
+ config :csv_col_sep, :validate => :string, :default => ","
+ # the quote character for a CSV file
+ config :csv_quote_char, :validate => :string, :default => '"'
  # if the json file provided is an array, this specifies which field of the
  # array of objects is the key value
  config :json_key, :validate => :string
@@ -265,7 +269,7 @@ private
  raise LogStash::ConfigurationError, "The csv_first_line is set to 'ignore' but csv_header is not set"
  end
  end
- csv_lines = CSV.read(filename);
+ csv_lines = CSV.read(filename, { :col_sep => @csv_col_sep, :quote_char => @csv_quote_char });
  if @csv_first_line == 'header'
  @csv_header = csv_lines.shift
  elsif @csv_first_line == 'ignore'
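The `CSV.read` change above simply forwards the two new options to Ruby's stdlib CSV. A minimal sketch of what those options do, using `CSV.parse` on an inline string instead of a file (the data and header here are made up, shaped like the gem's tab fixture):

```ruby
require "csv"

# Tab-separated lookup data, two rows.
data = "200\tgreen\tok\n404\tred\tnot found\n"

# :col_sep and :quote_char behave the same for CSV.parse as for CSV.read.
rows = CSV.parse(data, col_sep: "\t", quote_char: '"')

# Build a lookup keyed on the first column, mirroring the csv_key default.
header = ["status", "color", "message"]
dictionary = rows.map { |row| [row[0], header.zip(row).to_h] }.to_h

dictionary["200"]  # => {"status"=>"200", "color"=>"green", "message"=>"ok"}
```

With the default `col_sep` of `","` the same input would parse as one column per row, which is why the separator has to match the file.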
@@ -316,6 +320,9 @@ private
  if ! @dictionaries
  return
  end
+ if @refresh_interval < 0 && @dictionary_mtime # don't refresh if we aren't supposed to
+ return
+ end
  if (@next_refresh && @next_refresh + @refresh_interval < Time.now)
  return
  end
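The new guard implements the `refresh_interval => -1` behavior from the README table: once the dictionary has been loaded, a negative interval disables file checking entirely. The gating logic can be sketched in isolation with hypothetical names (this helper is illustrative, not part of the plugin):

```ruby
# Hypothetical helper: decide whether a dictionary file should be re-checked.
# refresh_interval: -1 disables checking, 0 checks every event, N checks every N seconds.
def should_check_file?(refresh_interval, last_checked_at, now = Time.now)
  return true  if last_checked_at.nil?      # first load always runs
  return false if refresh_interval < 0      # -1: never re-check after the first load
  now - last_checked_at >= refresh_interval # otherwise honor the interval
end
```

Setting the interval to 0 makes the predicate true on every call, which is why the README warns against it for busy pipelines.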
@@ -1,6 +1,6 @@
  Gem::Specification.new do |s|
  s.name = 'logstash-filter-augment'
- s.version = '0.1.0'
+ s.version = '0.2.0'
  s.licenses = ['Apache License (2.0)']
  s.summary = 'A logstash plugin to augment your events from data in files'
  s.description = 'A logstash plugin that can merge data from CSV, YAML, and JSON files with events.'
@@ -69,6 +69,25 @@ describe LogStash::Filters::Augment do
  expect { subject }.to raise_exception LogStash::ConfigurationError
  end
  end
+ describe "csv file with options set" do
+ filename = File.join(File.dirname(__FILE__), "..", "fixtures", "test-with-tabs.txt")
+ config <<-CONFIG
+ filter {
+ augment {
+ field => "status"
+ dictionary_path => '#{filename}'
+ dictionary_type => "csv"
+ csv_first_line => "data"
+ csv_header => ["status","color","message"]
+ csv_col_sep => "	"
+ }
+ }
+ CONFIG
+ sample("status" => "200") do
+ insist { subject.get("color") } == "green"
+ insist { subject.get("message") } == "ok"
+ end
+ end
  describe "simple csv file with header ignored" do
  filename = File.join(File.dirname(__FILE__), "..", "fixtures", "test-with-headers.csv")
  config <<-CONFIG
@@ -0,0 +1,2 @@
+ 200	green	ok
+ 404	red	not found
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: logstash-filter-augment
  version: !ruby/object:Gem::Version
- version: 0.1.0
+ version: 0.2.0
  platform: ruby
  authors:
  - Adam Caldwell
  autorequire:
  bindir: bin
  cert_chain: []
- date: 2017-01-16 00:00:00.000000000 Z
+ date: 2017-02-13 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
  requirement: !ruby/object:Gem::Requirement
@@ -56,6 +56,7 @@ files:
  - spec/fixtures/json-array.json
  - spec/fixtures/json-hash.json
  - spec/fixtures/test-with-headers.csv
+ - spec/fixtures/test-with-tabs.txt
  - spec/fixtures/test-without-headers.csv
  - spec/fixtures/yaml-array.yaml
  - spec/fixtures/yaml-object.yaml
@@ -92,6 +93,7 @@ test_files:
  - spec/fixtures/json-array.json
  - spec/fixtures/json-hash.json
  - spec/fixtures/test-with-headers.csv
+ - spec/fixtures/test-with-tabs.txt
  - spec/fixtures/test-without-headers.csv
  - spec/fixtures/yaml-array.yaml
  - spec/fixtures/yaml-object.yaml