logstash-input-sqs_to_s3 1.5.0

checksums.yaml.gz ADDED
@@ -0,0 +1,7 @@
1
+ ---
2
+ SHA256:
3
+ metadata.gz: 1709d15716d6889de3283f18eee924966eddd2c3cc345fbeeea9d5fa40853fc8
4
+ data.tar.gz: 14a7d289d2f2d992a365024717b1d5996a3cc48c1b814bcc198542758fc2f686
5
+ SHA512:
6
+ metadata.gz: 1939b901ff07dda96895f1f45e14563298a9eb96339008d3dacd207c3c2ec501642eb32969a53999d507c7ad1daf10ecc584efd78262f172017731047ce052b2
7
+ data.tar.gz: 525aadcbc4ef730228f3ba3d3d7ad43c7c5d1c71be2feec832e5688f9b5a863dda628496f645582901e4f02c59a237ddc6d01251cd6478733a13f2ab10fffec0
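The digests above can be checked against a downloaded copy of the gem's `metadata.gz` and `data.tar.gz`. The following is a minimal sketch using Ruby's standard `digest` library; the helper name `checksum_matches?` and the local file paths in the usage comment are assumptions for illustration, not part of the gem.

```ruby
require "digest"

# Hypothetical helper (not part of the gem): compare a file's digest with an
# expected hex string such as the ones listed in checksums.yaml.
def checksum_matches?(path, expected_hex, algorithm: Digest::SHA256)
  algorithm.file(path).hexdigest == expected_hex.strip.downcase
end

# Usage, assuming metadata.gz was extracted into the current directory:
#   checksum_matches?("metadata.gz",
#     "1709d15716d6889de3283f18eee924966eddd2c3cc345fbeeea9d5fa40853fc8")
# Pass `algorithm: Digest::SHA512` to check the SHA512 entries instead.
```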
data/CHANGELOG.md ADDED
@@ -0,0 +1,15 @@
1
+ ## 1.1.0
2
+ - Logstash 5 compatibility
3
+
4
+ ## 1.0.3
5
+ - added some metadata to the event (bucket and object name, as committed by joshuaspence)
6
+ - also try to unzip files ending with ".gz" (ALB logs are zipped but not marked with proper Content-Encoding)
7
+
8
+ ## 1.0.2
9
+ - fix for broken UTF-8 (so we won't lose a whole s3 log file because of a single invalid line, ruby's split will die on those)
10
+
11
+ ## 1.0.1
12
+ - same (because of screwed up rubygems.org release)
13
+
14
+ ## 1.0.0
15
+ - Initial Release
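The 1.0.3 and 1.0.2 entries describe two body-handling steps: try to gunzip objects whose key ends in ".gz" even without a `Content-Encoding` header, and replace invalid bytes before splitting into lines. The sketch below illustrates both steps in isolation. The helper name `read_lines` and the sample data are hypothetical, and rewinding the stream after a failed gzip probe is an assumption of this sketch rather than something the changelog states.

```ruby
require "zlib"
require "stringio"

# Hypothetical helper (not part of the plugin): turn a downloaded body into
# an array of valid UTF-8 lines.
def read_lines(io, content_encoding: nil, key: "")
  if content_encoding == "gzip" || key.end_with?(".gz")
    begin
      io = Zlib::GzipReader.new(io)
    rescue Zlib::GzipFile::Error
      # Not actually gzipped: GzipReader has already consumed bytes while
      # probing the header, so start over from the beginning.
      io.rewind
    end
  end
  # Treat the input as raw bytes and replace anything that is not ASCII,
  # so one bad byte cannot break the split for the whole file.
  io.read
    .encode("UTF-8", "binary", invalid: :replace, undef: :replace, replace: "\u2370")
    .split(/\n/)
end

gzipped = StringIO.new(Zlib.gzip("one\ntwo\n"))
mislabeled = StringIO.new("first line\n\xFFsecond\n".b)

gz_lines = read_lines(gzipped, key: "logs/example.log.gz")
plain_lines = read_lines(mislabeled, key: "logs/not-really.gz")
```

Because the source encoding is declared as `binary`, every non-ASCII byte is replaced, including bytes that form valid multibyte UTF-8 characters. `String#scrub` is an alternative when valid multibyte text should survive.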
data/CONTRIBUTORS ADDED
@@ -0,0 +1,12 @@
1
+ The following is a list of people who have contributed ideas, code, bug
2
+ reports, or in general have helped logstash along its way.
3
+
4
+ Contributors:
5
+ * joshuaspence (event metadata)
6
+ * Heiko-san (initial contributor)
7
+ * logstash-input-sqs plugin as code base
8
+
9
+ Note: If you've sent us patches, bug reports, or otherwise contributed to
10
+ Logstash, and you aren't on the list above and want to be, please let us know
11
+ and we'll make sure you're here. Contributions from folks like you are what make
12
+ open source awesome.
data/Gemfile ADDED
@@ -0,0 +1,2 @@
1
+ source 'https://rubygems.org'
2
+ gemspec
data/LICENSE ADDED
@@ -0,0 +1,13 @@
1
+ Copyright (c) 2012–2015 Elasticsearch <http://www.elastic.co>
2
+
3
+ Licensed under the Apache License, Version 2.0 (the "License");
4
+ you may not use this file except in compliance with the License.
5
+ You may obtain a copy of the License at
6
+
7
+ http://www.apache.org/licenses/LICENSE-2.0
8
+
9
+ Unless required by applicable law or agreed to in writing, software
10
+ distributed under the License is distributed on an "AS IS" BASIS,
11
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
12
+ See the License for the specific language governing permissions and
13
+ limitations under the License.
data/NOTICE.TXT ADDED
@@ -0,0 +1,5 @@
1
+ Elasticsearch
2
+ Copyright 2012-2015 Elasticsearch
3
+
4
+ This product includes software developed by The Apache Software
5
+ Foundation (http://www.apache.org/).
data/README.md ADDED
@@ -0,0 +1,86 @@
1
+ # Logstash Plugin
2
+
3
+ This is a plugin for [Logstash](https://github.com/elastic/logstash).
4
+
5
+ It is fully free and fully open source. The license is Apache 2.0.
6
+
7
+ ## Documentation
8
+
9
+ Logstash provides infrastructure to automatically generate documentation for this plugin. We use the asciidoc format to write documentation, so any comments in the source code will first be converted into asciidoc and then into html. All plugin documentation is placed under one [central location](http://www.elastic.co/guide/en/logstash/current/).
10
+
11
+ - For formatting code or config examples, you can use the asciidoc `[source,ruby]` directive
12
+ - For more asciidoc formatting tips, see the excellent reference here https://github.com/elastic/docs#asciidoc-guide
13
+
14
+ ## Need Help?
15
+
16
+ Need help? Try #logstash on freenode IRC or the https://discuss.elastic.co/c/logstash discussion forum.
17
+
18
+ ## Developing
19
+
20
+ ### 1. Plugin Development and Testing
21
+
22
+ #### Code
23
+ - To get started, you'll need JRuby with the Bundler gem installed.
24
+
25
+ - Create a new plugin or clone an existing one from the GitHub [logstash-plugins](https://github.com/logstash-plugins) organization. We also provide [example plugins](https://github.com/logstash-plugins?query=example).
26
+
27
+ - Install dependencies
28
+ ```sh
29
+ bundle install
30
+ ```
31
+
32
+ #### Test
33
+
34
+ - Update your dependencies
35
+
36
+ ```sh
37
+ bundle install
38
+ ```
39
+
40
+ - Run tests
41
+
42
+ ```sh
43
+ bundle exec rspec
44
+ ```
45
+
46
+ ### 2. Running your unpublished Plugin in Logstash
47
+
48
+ #### 2.1 Run in a local Logstash clone
49
+
50
+ - Edit Logstash `Gemfile` and add the local plugin path, for example:
51
+ ```ruby
52
+ gem "logstash-filter-awesome", :path => "/your/local/logstash-filter-awesome"
53
+ ```
54
+ - Install plugin
55
+ ```sh
56
+ bin/plugin install --no-verify
57
+ ```
58
+ - Run Logstash with your plugin
59
+ ```sh
60
+ bin/logstash -e 'filter {awesome {}}'
61
+ ```
62
+ At this point any modifications to the plugin code will be applied to this local Logstash setup. After modifying the plugin, simply rerun Logstash.
63
+
64
+ #### 2.2 Run in an installed Logstash
65
+
66
+ You can use the same **2.1** method to run your plugin in an installed Logstash by editing its `Gemfile` and pointing the `:path` to your local plugin development directory, or you can build the gem and install it as follows:
67
+
68
+ - Build your plugin gem
69
+ ```sh
70
+ gem build logstash-filter-awesome.gemspec
71
+ ```
72
+ - Install the plugin from the Logstash home
73
+ ```sh
74
+ bin/plugin install /your/local/plugin/logstash-filter-awesome.gem
75
+ ```
76
+ - Start Logstash and proceed to test the plugin
77
+
78
+ ## Contributing
79
+
80
+ All contributions are welcome: ideas, patches, documentation, bug reports, complaints, and even something you drew up on a napkin.
81
+
82
+ Programming is not a required skill. Whatever you've seen about open source and maintainers or community members saying "send patches or die" - you will not see that here.
83
+
84
+ It is more important to the community that you are able to contribute.
85
+
86
+ For more information about contributing, see the [CONTRIBUTING](https://github.com/elastic/logstash/blob/master/CONTRIBUTING.md) file.
data/lib/logstash/inputs/s3sqs.rb ADDED
@@ -0,0 +1,220 @@
1
+ # encoding: utf-8
2
+ #
3
+ require "logstash/inputs/threadable"
4
+ require "logstash/namespace"
5
+ require "logstash/timestamp"
6
+ require "logstash/plugin_mixins/aws_config"
7
+ require "logstash/errors"
8
+
9
+ # Get logs from AWS s3 buckets as issued by an object-created event via sqs.
10
+ #
11
+ # This plugin is based on the logstash-input-sqs plugin but doesn't log the sqs event itself.
12
+ # Instead it assumes that the event is an s3 object-created event and will then download
13
+ # and process the given file.
14
+ #
15
+ # Some issues of logstash-input-sqs, like logstash not shutting down properly, have been
16
+ # fixed for this plugin.
17
+ #
18
+ # In contrast to logstash-input-sqs this plugin uses the "Receive Message Wait Time"
19
+ # configured for the sqs queue in question, a good value will be something like 10 seconds
20
+ # to ensure a reasonable shutdown time of logstash.
21
+ # Also use a "Default Visibility Timeout" that is high enough for log files to be downloaded
22
+ # and processed (I think a good value should be 5-10 minutes for most use cases), the plugin will
23
+ # avoid removing the event from the queue if the associated log file couldn't be correctly
24
+ # passed to the processing level of logstash (e.g. downloaded content size doesn't match sqs event).
25
+ #
26
+ # This plugin is meant for high availability setups, in contrast to logstash-input-s3 you can safely
27
+ # use multiple logstash nodes, since the usage of sqs will ensure that each logfile is processed
28
+ # only once and no file will get lost on node failure or downscaling for auto-scaling groups.
29
+ # (You should use a "Message Retention Period" >= 4 days for your sqs to ensure you can survive
30
+ # a weekend of faulty log file processing)
31
+ # The plugin will not delete objects from s3 buckets, so make sure to have a reasonable "Lifecycle"
32
+ # configured for your buckets, which should keep the files for at least the queue's "Message Retention Period".
33
+ #
34
+ # A typical setup will contain some s3 buckets containing elb, cloudtrail or other log files.
35
+ # These will be configured to send object-created events to a sqs queue, which will be configured
36
+ # as the source queue for this plugin.
37
+ # (The plugin supports gzipped content if it is marked with "Content-Encoding: gzip" as it is the
38
+ # case for cloudtrail logs)
39
+ #
40
+ # The logstash node therefore must have sqs permissions + the permissions to download objects
41
+ # from the s3 buckets that send events to the queue.
42
+ # (If logstash nodes are running on EC2 you should use an IAM instance role to provide permissions)
43
+ # [source,json]
44
+ # {
45
+ # "Version": "2012-10-17",
46
+ # "Statement": [
47
+ # {
48
+ # "Effect": "Allow",
49
+ # "Action": [
50
+ # "sqs:Get*",
51
+ # "sqs:List*",
52
+ # "sqs:ReceiveMessage",
53
+ # "sqs:ChangeMessageVisibility*",
54
+ # "sqs:DeleteMessage*"
55
+ # ],
56
+ # "Resource": [
57
+ # "arn:aws:sqs:us-east-1:123456789012:my-elb-log-queue"
58
+ # ]
59
+ # },
60
+ # {
61
+ # "Effect": "Allow",
62
+ # "Action": [
63
+ # "s3:Get*",
64
+ # "s3:List*"
65
+ # ],
66
+ # "Resource": [
67
+ # "arn:aws:s3:::my-elb-logs",
68
+ # "arn:aws:s3:::my-elb-logs/*"
69
+ # ]
70
+ # }
71
+ # ]
72
+ # }
73
+ #
74
+ class LogStash::Inputs::S3SQS < LogStash::Inputs::Threadable
75
+ include LogStash::PluginMixins::AwsConfig::V2
76
+
77
+ BACKOFF_SLEEP_TIME = 1
78
+ BACKOFF_FACTOR = 2
79
+ MAX_TIME_BEFORE_GIVING_UP = 60
80
+ EVENT_SOURCE = 'aws:s3'
81
+ EVENT_TYPE = 'ObjectCreated'
82
+
83
+ config_name "s3sqs"
84
+
85
+ default :codec, "plain"
86
+
87
+ # Name of the SQS Queue to pull messages from. Note that this is just the name of the queue, not the URL or ARN.
88
+ config :queue, :validate => :string, :required => true
89
+
90
+ attr_reader :poller
91
+ attr_reader :s3
92
+
93
+ def register
94
+ require "aws-sdk"
95
+ @logger.info("Registering SQS input", :queue => @queue)
96
+ setup_queue
97
+ end
98
+
99
+ def setup_queue
100
+ aws_sqs_client = Aws::SQS::Client.new(aws_options_hash)
101
+ queue_url = aws_sqs_client.get_queue_url(:queue_name => @queue)[:queue_url]
102
+ @poller = Aws::SQS::QueuePoller.new(queue_url, :client => aws_sqs_client)
103
+ @s3 = Aws::S3::Client.new(aws_options_hash)
104
+ rescue Aws::SQS::Errors::ServiceError => e
105
+ @logger.error("Cannot establish connection to Amazon SQS", :error => e)
106
+ raise LogStash::ConfigurationError, "Verify the SQS queue name and your credentials"
107
+ end
108
+
109
+ def polling_options
110
+ {
111
+ # we will query 1 message at a time, so we can ensure correct error handling if we can't download a single file correctly
112
+ # (we will throw :skip_delete if download size isn't correct to process the event again later
113
+ # -> set a reasonable "Default Visibility Timeout" for your queue, so that there's enough time to process the log files)
114
+ :max_number_of_messages => 1,
115
+ # we will use the queue's setting, a good value is 10 seconds
116
+ # (to ensure fast logstash shutdown on the one hand and few api calls on the other hand)
117
+ :wait_time_seconds => nil,
118
+ }
119
+ end
120
+
121
+ def handle_message(message, queue)
122
+ hash = JSON.parse message.body
123
+ # there may be test events sent from the s3 bucket which won't contain a Records array,
124
+ # we will skip those events and remove them from queue
125
+ if hash['Records'] then
126
+ # typically there will be only 1 record per event, but since it is an array we will
127
+ # treat it as if there could be more records
128
+ hash['Records'].each do |record|
129
+ # in case there are any events with Records that aren't s3 object-created events and can't therefore be
130
+ # processed by this plugin, we will skip them and remove them from queue
131
+ if record['eventSource'] == EVENT_SOURCE and record['eventName'].start_with?(EVENT_TYPE) then
132
+ # try download and :skip_delete if it fails
133
+ begin
134
+ response = @s3.get_object(
135
+ bucket: record['s3']['bucket']['name'],
136
+ key: record['s3']['object']['key']
137
+ )
138
+ rescue => e
139
+ @logger.warn("issuing :skip_delete on failed download", :bucket => record['s3']['bucket']['name'], :object => record['s3']['object']['key'], :error => e)
140
+ throw :skip_delete
141
+ end
142
+ # verify downloaded content size
143
+ if response.content_length == record['s3']['object']['size'] then
144
+ body = response.body
145
+ # if necessary unzip
146
+ if response.content_encoding == "gzip" or record['s3']['object']['key'].end_with?(".gz") then
147
+ begin
148
+ temp = Zlib::GzipReader.new(body)
149
+ rescue => e
150
+ @logger.warn("content is marked to be gzipped but can't unzip it, assuming plain text", :bucket => record['s3']['bucket']['name'], :object => record['s3']['object']['key'], :error => e)
151
+ temp = body.tap(&:rewind) # rewind: GzipReader already consumed bytes while probing the header
152
+ end
153
+ body = temp
154
+ end
155
+ # process the plain text content
156
+ begin
157
+ lines = body.read.encode('UTF-8', 'binary', invalid: :replace, undef: :replace, replace: "\u2370").split(/\n/)
158
+ lines.each do |line|
159
+ @codec.decode(line) do |event|
160
+ decorate(event)
161
+
162
+ event.set('[@metadata][s3_bucket_name]', record['s3']['bucket']['name'])
163
+ event.set('[@metadata][s3_object_key]', record['s3']['object']['key'])
164
+
165
+ queue << event
166
+ end
167
+ end
168
+ rescue => e
169
+ @logger.warn("issuing :skip_delete on failed plain text processing", :bucket => record['s3']['bucket']['name'], :object => record['s3']['object']['key'], :error => e)
170
+ throw :skip_delete
171
+ end
172
+ # otherwise try again later
173
+ else
174
+ @logger.warn("issuing :skip_delete on wrong download content size", :bucket => record['s3']['bucket']['name'], :object => record['s3']['object']['key'],
175
+ :download_size => response.content_length, :expected => record['s3']['object']['size'])
176
+ throw :skip_delete
177
+ end
178
+ end
179
+ end
180
+ end
181
+ end
182
+
183
+ def run(queue)
184
+ # ensure we can stop logstash correctly
185
+ poller.before_request do |stats|
186
+ if stop? then
187
+ @logger.warn("issuing :stop_polling on stop?", :queue => @queue)
188
+ # this can take up to "Receive Message Wait Time" (of the sqs queue) seconds to be recognized
189
+ throw :stop_polling
190
+ end
191
+ end
192
+ # poll a message and process it
193
+ run_with_backoff do
194
+ poller.poll(polling_options) do |message|
195
+ handle_message(message, queue)
196
+ end
197
+ end
198
+ end
199
+
200
+ private
201
+ # Runs an AWS request inside a Ruby block with an exponential backoff in case
202
+ # we experience a ServiceError.
203
+ #
204
+ # @param [Integer] max_time maximum amount of time to sleep before giving up.
205
+ # @param [Integer] sleep_time the initial amount of time to sleep before retrying.
206
+ # @param [Block] block Ruby code block to execute.
207
+ def run_with_backoff(max_time = MAX_TIME_BEFORE_GIVING_UP, sleep_time = BACKOFF_SLEEP_TIME, &block)
208
+ next_sleep = sleep_time
209
+ begin
210
+ block.call
211
+ next_sleep = sleep_time
212
+ rescue Aws::SQS::Errors::ServiceError => e
213
+ @logger.warn("Aws::SQS::Errors::ServiceError ... retrying SQS request with exponential backoff", :queue => @queue, :sleep_time => next_sleep, :error => e)
214
+ sleep(next_sleep)
215
+ # grow the current delay (not the initial one); start over once it exceeds max_time
+ next_sleep = next_sleep > max_time ? sleep_time : next_sleep * BACKOFF_FACTOR
216
+ retry
217
+ end
218
+ end
219
+
220
+ end # class
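The retry behaviour documented for `run_with_backoff` (sleep, multiply the delay, start over once the delay exceeds the maximum) can be sketched without any AWS dependency. This is an illustration under stated assumptions, not the plugin's code: the helper name `with_backoff`, its keyword arguments, and the injectable `sleeper` are hypothetical and exist so the pauses can be observed without actually sleeping.

```ruby
# Hypothetical helper (not part of the plugin): retry a block with
# exponentially growing pauses, resetting once the pause exceeds `max`.
def with_backoff(initial: 1, factor: 2, max: 60, retry_on: StandardError,
                 sleeper: ->(seconds) { sleep(seconds) })
  delay = initial
  begin
    yield
  rescue retry_on
    sleeper.call(delay)
    # Grow the pause; start over from `initial` once it exceeds `max`.
    delay = delay > max ? initial : delay * factor
    retry
  end
end

# Usage: fail four times, then succeed; record pauses instead of sleeping.
attempts = 0
pauses = []
result = with_backoff(sleeper: ->(seconds) { pauses << seconds }) do
  attempts += 1
  raise "transient failure" if attempts <= 4
  :ok
end
```

With the defaults shown, the recorded pauses double on each failure. In the plugin the retried error class would be `Aws::SQS::Errors::ServiceError`, which could be passed as `retry_on`.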
data/logstash-input-sqs_to_s3.gemspec ADDED
@@ -0,0 +1,29 @@
1
+ Gem::Specification.new do |s|
2
+ s.name = 'logstash-input-sqs_to_s3'
3
+ s.version = '1.5.0'
4
+ s.licenses = ['Apache License (2.0)']
5
+ s.summary = "Get logs from AWS s3 buckets as issued by an object-created event via sqs."
6
+ s.description = "This gem is a logstash plugin required to be installed on top of the Logstash core pipeline using $LS_HOME/bin/plugin install gemname. This gem is not a stand-alone program. This version works with Logstash 5."
7
+ s.authors = ["Heiko Finzel"]
8
+ s.email = 'hfi@boreus.de'
9
+ s.homepage = "https://www.boreus.de"
10
+ s.require_paths = ["lib"]
11
+
12
+ # Files
13
+ s.files = Dir['lib/**/*','spec/**/*','vendor/**/*','*.gemspec','*.md','CONTRIBUTORS','Gemfile','LICENSE','NOTICE.TXT']
14
+
15
+ # Tests
16
+ s.test_files = s.files.grep(%r{^(test|spec|features)/})
17
+
18
+ # Special flag to let us know this is actually a logstash plugin
19
+ s.metadata = { "logstash_plugin" => "true", "logstash_group" => "input" }
20
+
21
+ # Gem dependencies
22
+ s.add_runtime_dependency "logstash-core-plugin-api", ">= 1.60", "<= 2.99"
23
+
24
+ s.add_runtime_dependency 'logstash-codec-json'
25
+ s.add_runtime_dependency "logstash-mixin-aws", ">= 1.0.0"
26
+
27
+ s.add_development_dependency 'logstash-devutils'
28
+ end
29
+
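The `logstash-core-plugin-api` dependency combines two constraints, `>= 1.60` and `<= 2.99`. As a small sketch of how RubyGems evaluates such a pair, the snippet below uses `Gem::Requirement` and `Gem::Version` from RubyGems itself. The candidate version strings are illustrative only and are not a list of real releases.

```ruby
require "rubygems"

# Same constraint pair as the gemspec's logstash-core-plugin-api dependency.
requirement = Gem::Requirement.new(">= 1.60", "<= 2.99")

# Illustrative candidates; RubyGems compares versions segment by segment,
# so "1.7" is lower than "1.60" (7 < 60).
candidates = ["1.7", "1.60", "2.1.12", "2.99", "3.0"]

checks = candidates.map do |version|
  [version, requirement.satisfied_by?(Gem::Version.new(version))]
end.to_h

checks.each { |version, ok| puts "#{version}: #{ok ? 'allowed' : 'rejected'}" }
```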
data/spec/inputs/s3sqs_spec.rb ADDED
@@ -0,0 +1,9 @@
1
+ # encoding: utf-8
2
+ require "logstash/devutils/rspec/spec_helper"
3
+ require "logstash/inputs/s3sqs"
4
+
5
+ describe LogStash::Inputs::S3SQS do
6
+
7
+ it("runs a placeholder example") { expect(true).to be(true) }
8
+
9
+ end
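The spec above is only a placeholder. As a sketch of logic that could be tested in isolation, the record filtering done in `handle_message` (skip messages without a `Records` array, keep only `aws:s3` records whose `eventName` starts with `ObjectCreated`) can be written as a pure function. The function name `created_objects`, the constant names, and the sample payloads are assumptions for illustration.

```ruby
require "json"

S3_EVENT_SOURCE = "aws:s3"
S3_EVENT_PREFIX = "ObjectCreated"

# Hypothetical helper (not part of the plugin): list the S3 objects that an
# SQS message body announces as newly created.
def created_objects(message_body)
  # Test events sent by S3 carry no "Records" array; treat them as empty.
  records = JSON.parse(message_body)["Records"] || []
  records.select do |record|
    record["eventSource"] == S3_EVENT_SOURCE &&
      record["eventName"].to_s.start_with?(S3_EVENT_PREFIX)
  end.map do |record|
    {
      bucket: record.dig("s3", "bucket", "name"),
      key:    record.dig("s3", "object", "key"),
      size:   record.dig("s3", "object", "size"),
    }
  end
end

# Sample payloads (made up for this sketch).
notification = JSON.generate(
  "Records" => [
    { "eventSource" => "aws:s3", "eventName" => "ObjectCreated:Put",
      "s3" => { "bucket" => { "name" => "my-elb-logs" },
                "object" => { "key" => "logs/example.log.gz", "size" => 1024 } } },
    { "eventSource" => "aws:s3", "eventName" => "ObjectRemoved:Delete",
      "s3" => { "bucket" => { "name" => "my-elb-logs" },
                "object" => { "key" => "logs/old.log.gz" } } },
  ]
)
test_event = JSON.generate("Service" => "Amazon S3", "Event" => "s3:TestEvent")

objects = created_objects(notification)
none = created_objects(test_event)
```

Keeping the filter free of AWS calls means a spec could cover it with plain JSON fixtures, leaving the download and size check to separate tests.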
data/spec/spec_helper.rb ADDED
@@ -0,0 +1,2 @@
1
+ # encoding: utf-8
2
+ require "logstash/devutils/rspec/spec_helper"
metadata ADDED
@@ -0,0 +1,121 @@
1
+ --- !ruby/object:Gem::Specification
2
+ name: logstash-input-sqs_to_s3
3
+ version: !ruby/object:Gem::Version
4
+ version: 1.5.0
5
+ platform: ruby
6
+ authors:
7
+ - Heiko Finzel
8
+ autorequire:
9
+ bindir: bin
10
+ cert_chain: []
11
+ date: 2017-11-02 00:00:00.000000000 Z
12
+ dependencies:
13
+ - !ruby/object:Gem::Dependency
14
+ requirement: !ruby/object:Gem::Requirement
15
+ requirements:
16
+ - - ">="
17
+ - !ruby/object:Gem::Version
18
+ version: '1.60'
19
+ - - "<="
20
+ - !ruby/object:Gem::Version
21
+ version: '2.99'
22
+ name: logstash-core-plugin-api
23
+ prerelease: false
24
+ type: :runtime
25
+ version_requirements: !ruby/object:Gem::Requirement
26
+ requirements:
27
+ - - ">="
28
+ - !ruby/object:Gem::Version
29
+ version: '1.60'
30
+ - - "<="
31
+ - !ruby/object:Gem::Version
32
+ version: '2.99'
33
+ - !ruby/object:Gem::Dependency
34
+ requirement: !ruby/object:Gem::Requirement
35
+ requirements:
36
+ - - ">="
37
+ - !ruby/object:Gem::Version
38
+ version: '0'
39
+ name: logstash-codec-json
40
+ prerelease: false
41
+ type: :runtime
42
+ version_requirements: !ruby/object:Gem::Requirement
43
+ requirements:
44
+ - - ">="
45
+ - !ruby/object:Gem::Version
46
+ version: '0'
47
+ - !ruby/object:Gem::Dependency
48
+ requirement: !ruby/object:Gem::Requirement
49
+ requirements:
50
+ - - ">="
51
+ - !ruby/object:Gem::Version
52
+ version: 1.0.0
53
+ name: logstash-mixin-aws
54
+ prerelease: false
55
+ type: :runtime
56
+ version_requirements: !ruby/object:Gem::Requirement
57
+ requirements:
58
+ - - ">="
59
+ - !ruby/object:Gem::Version
60
+ version: 1.0.0
61
+ - !ruby/object:Gem::Dependency
62
+ requirement: !ruby/object:Gem::Requirement
63
+ requirements:
64
+ - - ">="
65
+ - !ruby/object:Gem::Version
66
+ version: '0'
67
+ name: logstash-devutils
68
+ prerelease: false
69
+ type: :development
70
+ version_requirements: !ruby/object:Gem::Requirement
71
+ requirements:
72
+ - - ">="
73
+ - !ruby/object:Gem::Version
74
+ version: '0'
75
+ description: This gem is a logstash plugin required to be installed on top of the
76
+ Logstash core pipeline using $LS_HOME/bin/plugin install gemname. This gem is not
77
+ a stand-alone program. This version works with Logstash 5.
78
+ email: hfi@boreus.de
79
+ executables: []
80
+ extensions: []
81
+ extra_rdoc_files: []
82
+ files:
83
+ - CHANGELOG.md
84
+ - CONTRIBUTORS
85
+ - Gemfile
86
+ - LICENSE
87
+ - NOTICE.TXT
88
+ - README.md
89
+ - lib/logstash/inputs/s3sqs.rb
90
+ - logstash-input-sqs_to_s3.gemspec
91
+ - spec/inputs/s3sqs_spec.rb
92
+ - spec/spec_helper.rb
93
+ homepage: https://www.boreus.de
94
+ licenses:
95
+ - Apache License (2.0)
96
+ metadata:
97
+ logstash_plugin: 'true'
98
+ logstash_group: input
99
+ post_install_message:
100
+ rdoc_options: []
101
+ require_paths:
102
+ - lib
103
+ required_ruby_version: !ruby/object:Gem::Requirement
104
+ requirements:
105
+ - - ">="
106
+ - !ruby/object:Gem::Version
107
+ version: '0'
108
+ required_rubygems_version: !ruby/object:Gem::Requirement
109
+ requirements:
110
+ - - ">="
111
+ - !ruby/object:Gem::Version
112
+ version: '0'
113
+ requirements: []
114
+ rubyforge_project:
115
+ rubygems_version: 2.6.13
116
+ signing_key:
117
+ specification_version: 4
118
+ summary: Get logs from AWS s3 buckets as issued by an object-created event via sqs.
119
+ test_files:
120
+ - spec/inputs/s3sqs_spec.rb
121
+ - spec/spec_helper.rb