fluent-plugin-cloudwatch-ingest 1.1.0 → 1.2.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- checksums.yaml +4 -4
- data/CHANGELOG.md +11 -0
- data/README.md +25 -6
- data/lib/fluent/plugin/cloudwatch/ingest/version.rb +1 -1
- data/lib/fluent/plugin/in_cloudwatch_ingest.rb +1 -0
- data/lib/fluent/plugin/parser_cloudwatch_ingest.rb +42 -0
- metadata +3 -3
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA1:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 9099fc9a7106d463763eedd19d543e9e1023231c
+  data.tar.gz: 2d0203a3edd8f96f7a0f25a71dadf79b1d6f8c3d
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 41d7a5122379a9e2c8ff975e61b49475e2a1ee31b7ca3bba5aa468e5913f7d1ed2f71e7175f7fb2a2fef3c5a1f844150fc9a35054e0d235152b1fe39ba18132e
+  data.tar.gz: 39c4c78a663a47054c147e785075de17eb70f12933eb42a1f4e821485cb7e59bed17d260babe967b22edb84687213dafdbd6f7fa0a478e61eccb13eff265cc9a
data/CHANGELOG.md
CHANGED
@@ -36,3 +36,14 @@
 * Remove streams from state file that are no longer present (@chaeyk)
 * Apply `error_interval` when failing to get statefile lock (@chaeyk)
 * `api_interval` deprecated in favour of `error_interval`
+
+## 1.1.0
+
+* Update aws-sdk runtime dependency
+
+## 1.2.0
+
+* Add the ability to inject both the `ingestion_time` returned from the Cloudwatch Logs API and the time at which this plugin ingested the event into the record.
+* Add telemetry to the parser
+
+Both of these changes are designed to make debugging ingestion problems from high-volume log groups easier.
data/README.md
CHANGED
@@ -1,5 +1,5 @@
 # Fluentd Cloudwatch Plugin
-[](https://circleci.com/gh/sampointer/fluent-plugin-cloudwatch-ingest) [](https://badge.fury.io/rb/fluent-plugin-cloudwatch-ingest)  [](https://gitter.im/fluent-plugin-cloudwatch-ingest/Lobby?utm_source=share-link&utm_medium=link&utm_campaign=share-link)[](https://gemnasium.com/github.com/sampointer/fluent-plugin-cloudwatch-ingest)
+[](https://circleci.com/gh/sampointer/fluent-plugin-cloudwatch-ingest) [](https://badge.fury.io/rb/fluent-plugin-cloudwatch-ingest)  [](https://gitter.im/fluent-plugin-cloudwatch-ingest/Lobby?utm_source=share-link&utm_medium=link&utm_campaign=share-link) [](https://gemnasium.com/github.com/sampointer/fluent-plugin-cloudwatch-ingest)
 
 ## Introduction
 
@@ -49,11 +49,15 @@ Or install it yourself as:
     @type cloudwatch_ingest
     expression /^(?<message>.+)$/
     time_format %Y-%m-%d %H:%M:%S.%L
-    event_time true
-    inject_group_name true
-    inject_stream_name true
-
-
+    event_time true # take time from the Cloudwatch event, rather than parse it from the body
+    inject_group_name true # inject the group name into the record
+    inject_stream_name true # inject the stream name into the record
+    inject_cloudwatch_ingestion_time field_name # inject the `ingestion_time` as returned by the Cloudwatch API
+    inject_plugin_ingestion_time field_name # inject the 13 digit epoch time at which the plugin ingested the event
+    parse_json_body false # Attempt to parse the body as json and add structured fields from the result
+    fail_on_unparsable_json false # If the body cannot be parsed as json do not ingest the record
+    telemetry false # Produce statsd telemetry
+    statsd_endpoint localhost # Endpoint to which telemetry should be sent
   </parse>
 </source>
 ```
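For illustration only, a record emitted with the injection options above enabled might look like the following Ruby hash. The keys `cloudwatch_ingestion_time` and `plugin_ingestion_time` are hypothetical stand-ins for whatever `field_name` values you configure, and the timestamp formats follow the parser code later in this diff (ISO 8601 strings); this is a sketch, not output captured from the plugin.

```ruby
# Hypothetical record shape with inject_* options enabled. Field names and
# values are illustrative; the injected keys are whatever field_name values
# you choose in the <parse> section.
record = {
  'message'                   => 'START RequestId: 6f9f1b7c Version: $LATEST',
  'log_group_name'            => '/aws/lambda/my-function',    # inject_group_name
  'log_stream_name'           => '2017/08/18/[$LATEST]abcdef', # inject_stream_name
  'cloudwatch_ingestion_time' => '2017-08-18T10:15:32.500+00:00',
  'plugin_ingestion_time'     => '2017-08-18T10:15:40+00:00'
}
puts record
```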
@@ -81,6 +85,11 @@ If `fail_on_unparsable_json` is set to `true` a record body consisting of malfor
 
 The `expression` is applied before JSON parsing is attempted. One may therefore extract a JSON fragment from within the event body if it is decorated with additional free-form text.
 
+### High volume Log Groups
+If you're having ingestion problems from high-volume log groups you're advised to enable telemetry in both the main plugin and the parser, and also to set both `inject_cloudwatch_ingestion_time` and `inject_plugin_ingestion_time`.
+
+This will let your telemetry system plot the state of your rate limiting and the ingestion delay _inside_ Cloudwatch Logs (`timestamp` vs `ingestion_time`), so that you can take appropriate tuning action.
+
 ### Telemetry
 With `telemetry` set to `true` and a valid `statsd_endpoint` the plugin will emit telemetry in statsd format to 8125:UDP. It is up to you to configure your statsd-speaking daemon to add any prefix or tagging that you might want.
 
@@ -97,6 +106,16 @@ api.calls.getlogevents.invalid_token
 events.emitted.success
 ```
 
+Likewise, when telemetry is enabled for the parser, the emitted metrics are:
+
+```
+parser.record.attempted
+parser.record.success
+parser.json.success # if json parsing is enabled
+parser.json.failed # if json parsing is enabled
+parser.ingestion_skew # the difference between `timestamp` and `ingestion_time` as returned by the Cloudwatch API
+```
+
 ### Sub-second timestamps
 When using `event_time true` the `@timestamp` field for the record is taken from the time recorded against the event by Cloudwatch. This is the most common mode to run in as it's an easy path to normalization: all of your Lambdas or other AWS services need not share the same, valid `time_format`, nor a regex that matches every case.
 
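As a rough sketch of where `parser.ingestion_skew` comes from: both `timestamp` and `ingestion_time` are millisecond epoch integers returned by the Cloudwatch Logs API, so the gauge is simply their difference, i.e. the delay between the event occurring and Cloudwatch Logs making it available. The values below are made up; the subtraction mirrors the parser code in this diff.

```ruby
# Minimal sketch of the parser.ingestion_skew calculation, using illustrative
# millisecond epoch values in place of a real GetLogEvents response.
Event = Struct.new(:timestamp, :ingestion_time)
event = Event.new(1_503_050_132_000, 1_503_050_135_500)

skew_ms = event.ingestion_time - event.timestamp
puts "parser.ingestion_skew => #{skew_ms}" # => 3500
```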
data/lib/fluent/plugin/parser_cloudwatch_ingest.rb
CHANGED
@@ -1,6 +1,8 @@
+require 'date'
 require 'fluent/plugin/parser_regexp'
 require 'fluent/time'
 require 'multi_json'
+require 'statsd-ruby'
 
 module Fluent
   module Plugin
@@ -11,9 +13,13 @@ module Fluent
       config_param :time_format, :string, default: '%Y-%m-%d %H:%M:%S.%L'
       config_param :event_time, :bool, default: true
      config_param :inject_group_name, :bool, default: true
+      config_param :inject_cloudwatch_ingestion_time, :string, default: false
+      config_param :inject_plugin_ingestion_time, :string, default: false
      config_param :inject_stream_name, :bool, default: true
      config_param :parse_json_body, :bool, default: false
      config_param :fail_on_unparsable_json, :bool, default: false
+      config_param :telemetry, :bool, default: false
+      config_param :statsd_endpoint, :string, default: 'localhost'
 
      def initialize
        super
@@ -21,9 +27,21 @@ module Fluent
 
      def configure(conf)
        super
+        @statsd = Statsd.new @statsd_endpoint, 8125 if @telemetry
+      end
+
+      def metric(method, name, value = 0)
+        case method
+        when :increment
+          @statsd.send(method, name) if @telemetry
+        else
+          @statsd.send(method, name, value) if @telemetry
+        end
      end
 
      def parse(event, group, stream)
+        metric(:increment, 'parser.record.attempted')
+
        time = nil
        record = nil
        super(event.message) do |t, r|
@@ -38,10 +56,12 @@ module Fluent
            # message into the record we'd bork on
            # nested keys. Force level one Strings.
            json_body = MultiJson.load(record['message'])
+            metric(:increment, 'parser.json.success')
            json_body.each_pair do |k, v|
              record[k.to_s] = v.to_s
            end
          rescue MultiJson::ParseError
+            metric(:increment, 'parser.json.failed')
            if @fail_on_unparsable_json
              yield nil, nil
              return
@@ -53,6 +73,27 @@ module Fluent
          record['log_group_name'] = group if @inject_group_name
          record['log_stream_name'] = stream if @inject_stream_name
 
+          if @inject_plugin_ingestion_time
+            now = DateTime.now
+            record[@inject_plugin_ingestion_time] = now.iso8601
+          end
+
+          if @inject_cloudwatch_ingestion_time
+            epoch_ms = event.ingestion_time.to_f / 1000
+            time = Time.at(epoch_ms)
+            record[@inject_cloudwatch_ingestion_time] =
+              time.to_datetime.iso8601(3)
+          end
+
+          # Optionally emit cloudwatch event and ingestion time skew telemetry
+          if @telemetry
+            metric(
+              :gauge,
+              'parser.ingestion_skew',
+              event.ingestion_time - event.timestamp
+            )
+          end
+
          # We do String processing on the event time here to
          # avoid rounding errors introduced by floating point
          # arithmetic.
@@ -61,6 +102,7 @@ module Fluent
 
          time = Fluent::EventTime.new(event_s, event_ns) if @event_time
 
+          metric(:increment, 'parser.record.success')
          yield time, record
        end
      end
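For context on the `metric` helper added above: when `telemetry` is enabled it delegates to the statsd-ruby client created in `configure`. A minimal sketch of the equivalent direct calls, assuming a statsd daemon is reachable at the configured `statsd_endpoint` on 8125/UDP and using illustrative metric values:

```ruby
# Rough equivalent of what metric() dispatches to via statsd-ruby.
require 'statsd-ruby'

statsd = Statsd.new 'localhost', 8125       # statsd_endpoint, 8125/UDP
statsd.increment 'parser.record.attempted'  # metric(:increment, 'parser.record.attempted')
statsd.gauge 'parser.ingestion_skew', 3500  # metric(:gauge, 'parser.ingestion_skew', 3500)
```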
metadata
CHANGED
@@ -1,14 +1,14 @@
 --- !ruby/object:Gem::Specification
 name: fluent-plugin-cloudwatch-ingest
 version: !ruby/object:Gem::Version
-  version: 1.
+  version: 1.2.0
 platform: ruby
 authors:
 - Sam Pointer
 autorequire:
 bindir: exe
 cert_chain: []
-date: 2017-
+date: 2017-08-18 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: bundler
@@ -168,7 +168,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
     version: '0'
 requirements: []
 rubyforge_project:
-rubygems_version: 2.6.
+rubygems_version: 2.6.12
 signing_key:
 specification_version: 4
 summary: Fluentd plugin to ingest AWS Cloudwatch logs