logstash-input-packetloop_s3 0.1.0
- checksums.yaml +7 -0
- data/CHANGELOG.md +4 -0
- data/CONTRIBUTORS +10 -0
- data/DEVELOPER.md +2 -0
- data/Gemfile +3 -0
- data/LICENSE +11 -0
- data/README.md +112 -0
- data/lib/logstash/inputs/packetloop_s3.rb +465 -0
- data/logstash-input-packetloop_s3.gemspec +25 -0
- data/spec/inputs/packetloop_s3_spec.rb +11 -0
- metadata +114 -0
checksums.yaml
ADDED
@@ -0,0 +1,7 @@
---
SHA1:
  metadata.gz: 542305755f5e4673fdc431ea518ae26417055f16
  data.tar.gz: ebfddd8fe01e87ffcc9384b6b6ec7926b1fabfb0
SHA512:
  metadata.gz: a669f6fee233aa58619da3692e22203337b6f44d6fd3a004f22f9393db7a43759c8470b0a3da1a36e20429239830328da83f17c2d3ff4978e09321e03cfeeeda
  data.tar.gz: fd6ae6d4f45f989b5bcd3d2baf7d752936e21c3ba7292e37ad0f3711db167a6c3bd69b21257bd5bd06691567c3cf133fa1c11ad50fdbddc229343fd69f1def2b
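These digests cover the two members of the published .gem archive. A minimal sketch of verifying one of them by hand (illustrative only; assumes the members have been extracted from the downloaded .gem, which is a plain tar archive):

```ruby
require 'digest'

# SHA512 for data.tar.gz, copied from checksums.yaml above.
expected = 'fd6ae6d4f45f989b5bcd3d2baf7d752936e21c3ba7292e37ad0f3711db167a6c' \
           '3bd69b21257bd5bd06691567c3cf133fa1c11ad50fdbddc229343fd69f1def2b'

# Recompute the digest of the extracted archive member and compare.
actual = Digest::SHA512.file('data.tar.gz').hexdigest
puts(actual == expected ? 'checksum OK' : 'checksum mismatch')
```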
data/CHANGELOG.md
ADDED
data/CONTRIBUTORS
ADDED
@@ -0,0 +1,10 @@
The following is a list of people who have contributed ideas, code, bug
reports, or in general have helped logstash along its way.

Contributors:
* Lenfree Yeung - lenfree.yeung@gmail.com

Note: If you've sent us patches, bug reports, or otherwise contributed to
Logstash, and you aren't on the list above and want to be, please let us know
and we'll make sure you're here. Contributions from folks like you are what make
open source awesome.
data/DEVELOPER.md
ADDED
data/Gemfile
ADDED
data/LICENSE
ADDED
@@ -0,0 +1,11 @@
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

  http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
data/README.md
ADDED
@@ -0,0 +1,112 @@
# Logstash Plugin - packetloop_s3

This is a fork of the S3 input plugin for [Logstash](https://github.com/elastic/logstash) that only reads CloudWatch logs from S3.
Unfortunately, CloudWatch logs are gzip-compressed but delivered without a .gz extension. The current S3 input plugin decides
whether a file is gzip-compressed by inspecting its file extension, and is therefore unable to read CloudWatch logs streamed via
Firehose or by other means.

This plugin is meant as a temporary workaround until the upstream change tracked in https://github.com/logstash-plugins/logstash-input-s3/issues/165
is merged. Hence, it does not have tests.

It is fully free and fully open source. The license is Apache 2.0, meaning you are pretty much free to use it however you want in whatever way.

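Since CloudWatch objects carry no `.gz` suffix, extension-independent detection has to look at the content itself. A minimal sketch of that idea (hypothetical helper, shown for context only; not part of this plugin and not necessarily how the upstream fix is implemented):

```ruby
# Hypothetical helper: detect gzip by content rather than by file extension.
def gzip_content?(filename)
  # Every gzip stream starts with the magic bytes 0x1f 0x8b.
  magic = File.open(filename, 'rb') { |f| f.read(2) }
  !magic.nil? && magic.bytes == [0x1f, 0x8b]
end
```

This plugin sidesteps detection entirely: with `force_gzip_decompress => true` it treats every object as gzip-compressed.
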
## Usage:
```bash
input {
  packetloop_s3 {
    access_key_id => "xxx"
    secret_access_key => "xxx"
    region => "us-east-1"
    delete => true
    force_gzip_decompress => true
    codec => cloudwatch_logs {
      decompress => false
    }
    interval => 30
    bucket => "bucket"
    type => "cloudwatch"
  }
}
```

## Documentation

Logstash provides infrastructure to automatically generate documentation for this plugin. We use the asciidoc format to write documentation so any comments in the source code will be first converted into asciidoc and then into html. All plugin documentation is placed under one [central location](http://www.elastic.co/guide/en/logstash/current/).

- For formatting code or config examples, you can use the asciidoc `[source,ruby]` directive
- For more asciidoc formatting tips, see the excellent reference here https://github.com/elastic/docs#asciidoc-guide

## Need Help?

Need help? Try #logstash on freenode IRC or the https://discuss.elastic.co/c/logstash discussion forum.

## Developing

### 1. Plugin Development and Testing

#### Code
- To get started, you'll need JRuby with the Bundler gem installed.

- Create a new plugin or clone an existing one from the GitHub [logstash-plugins](https://github.com/logstash-plugins) organization. We also provide [example plugins](https://github.com/logstash-plugins?query=example).

- Install dependencies
```sh
bundle install
```

#### Test

- Update your dependencies

```sh
bundle install
```

- Run tests

```sh
bundle exec rspec
```

### 2. Running your unpublished Plugin in Logstash

#### 2.1 Run in a local Logstash clone

- Edit Logstash `Gemfile` and add the local plugin path, for example:
```ruby
gem "logstash-filter-awesome", :path => "/your/local/logstash-filter-awesome"
```
- Install plugin
```sh
bin/logstash-plugin install --no-verify
```
- Run Logstash with your plugin
```sh
bin/logstash -e 'filter {awesome {}}'
```
At this point any modifications to the plugin code will be applied to this local Logstash setup. After modifying the plugin, simply rerun Logstash.

#### 2.2 Run in an installed Logstash

You can use the same **2.1** method to run your plugin in an installed Logstash by editing its `Gemfile` and pointing the `:path` to your local plugin development directory, or you can build the gem and install it using:

- Build your plugin gem
```sh
gem build logstash-filter-awesome.gemspec
```
- Install the plugin from the Logstash home
```sh
bin/logstash-plugin install /your/local/plugin/logstash-filter-awesome.gem
```
- Start Logstash and proceed to test the plugin

## Contributing

All contributions are welcome: ideas, patches, documentation, bug reports, complaints, and even something you drew up on a napkin.

Programming is not a required skill. Whatever you've seen about open source and maintainers or community members saying "send patches or die" - you will not see that here.

It is more important to the community that you are able to contribute.

For more information about contributing, see the [CONTRIBUTING](https://github.com/elastic/logstash/blob/master/CONTRIBUTING.md) file.
data/lib/logstash/inputs/packetloop_s3.rb
ADDED
@@ -0,0 +1,465 @@
# encoding: utf-8
require "logstash/inputs/base"
require "logstash/namespace"
require "logstash/plugin_mixins/aws_config"
require "time"
require "tmpdir"
require "stud/interval"
require "stud/temporary"
require "aws-sdk"
require "logstash/inputs/s3/patch"

require 'java'
java_import java.io.InputStream
java_import java.io.InputStreamReader
java_import java.io.FileInputStream
java_import java.io.BufferedReader
java_import java.util.zip.GZIPInputStream
java_import java.util.zip.ZipException

Aws.eager_autoload!
# Stream events from files in an S3 bucket.
#
# Each line from each file generates an event.
# Files ending in `.gz` are handled as gzip'ed files.
class LogStash::Inputs::PacketloopS3 < LogStash::Inputs::Base
  include LogStash::PluginMixins::AwsConfig::V2

  config_name "packetloop_s3"

  default :codec, "plain"

  # The name of the S3 bucket.
  config :bucket, :validate => :string, :required => true

  # If specified, the prefix of filenames in the bucket must match (not a regexp)
  config :prefix, :validate => :string, :default => nil

  config :additional_settings, :validate => :hash, :default => {}

  # The path to use for writing state. The state stored by this plugin is
  # a memory of files already processed by this plugin.
  #
  # If not specified, the default is in `{path.data}/plugins/inputs/s3/...`
  #
  # Should be a path with filename not just a directory.
  config :sincedb_path, :validate => :string, :default => nil

  # Name of an S3 bucket to backup processed files to.
  config :backup_to_bucket, :validate => :string, :default => nil

  # Append a prefix to the key (full path including file name in s3) after processing.
  # If backing up to another (or the same) bucket, this effectively lets you
  # choose a new 'folder' to place the files in
  config :backup_add_prefix, :validate => :string, :default => nil

  # Path of a local directory to backup processed files to.
  config :backup_to_dir, :validate => :string, :default => nil

  # Whether to delete processed files from the original bucket.
  config :delete, :validate => :boolean, :default => false

  # Interval to wait before checking the file list again after a run is finished.
  # Value is in seconds.
  config :interval, :validate => :number, :default => 60

  # Whether to watch for new files with the interval.
  # If false, overrides any interval and only lists the s3 bucket once.
  config :watch_for_new_files, :validate => :boolean, :default => true

  # Ruby style regexp of keys to exclude from the bucket
  config :exclude_pattern, :validate => :string, :default => nil

  # Set the directory where logstash will store the tmp files before processing them.
  # Defaults to the OS temporary directory, e.g. /tmp/logstash on Linux.
  config :temporary_directory, :validate => :string, :default => File.join(Dir.tmpdir, "logstash")

  # Whether or not to include the S3 object's properties (last_modified, content_type, metadata)
  # into each Event at [@metadata][s3]. Regardless of this setting, [@metadata][s3][key] will always
  # be present.
  config :include_object_properties, :validate => :boolean, :default => false

  # There are instances where files are gzip-compressed, such as Cloudwatch logs, and checking
  # file extensions to determine whether they are gzip-compressed is not enough. This version of
  # the input plugin is meant to be temporary until the upstream change in
  # https://github.com/logstash-plugins/logstash-input-s3/issues/165 has been merged.
  config :force_gzip_decompress, :validate => :boolean, :default => false

  public
  def register
    require "fileutils"
    require "digest/md5"
    require "aws-sdk-resources"

    @logger.info("Registering s3 input", :bucket => @bucket, :region => @region)

    s3 = get_s3object

    @s3bucket = s3.bucket(@bucket)

    unless @backup_to_bucket.nil?
      @backup_bucket = s3.bucket(@backup_to_bucket)
      begin
        s3.client.head_bucket({ :bucket => @backup_to_bucket})
      rescue Aws::S3::Errors::NoSuchBucket
        s3.create_bucket({ :bucket => @backup_to_bucket})
      end
    end

    unless @backup_to_dir.nil?
      Dir.mkdir(@backup_to_dir, 0700) unless File.exists?(@backup_to_dir)
    end

    FileUtils.mkdir_p(@temporary_directory) unless Dir.exist?(@temporary_directory)

    if !@watch_for_new_files && original_params.include?('interval')
      logger.warn("`watch_for_new_files` has been disabled; `interval` directive will be ignored.")
    end
  end

  public
  def run(queue)
    @current_thread = Thread.current
    Stud.interval(@interval) do
      process_files(queue)
      stop unless @watch_for_new_files
    end
  end # def run

  public
  def list_new_files
    objects = {}
    found = false
    begin
      @s3bucket.objects(:prefix => @prefix).each do |log|
        found = true
        @logger.debug("S3 input: Found key", :key => log.key)
        if ignore_filename?(log.key)
          @logger.debug('S3 input: Ignoring', :key => log.key)
        elsif log.content_length <= 0
          @logger.debug('S3 Input: Object Zero Length', :key => log.key)
        elsif !sincedb.newer?(log.last_modified)
          @logger.debug('S3 Input: Object Not Modified', :key => log.key)
        elsif log.storage_class.start_with?('GLACIER')
          @logger.debug('S3 Input: Object Archived to Glacier', :key => log.key)
        else
          objects[log.key] = log.last_modified
          @logger.debug("S3 input: Adding to objects[]", :key => log.key)
          @logger.debug("objects[] length is: ", :length => objects.length)
        end
      end
      @logger.info('S3 input: No files found in bucket', :prefix => prefix) unless found
    rescue Aws::Errors::ServiceError => e
      @logger.error("S3 input: Unable to list objects in bucket", :prefix => prefix, :message => e.message)
    end
    objects.keys.sort {|a,b| objects[a] <=> objects[b]}
  end # def list_new_files

  public
  def backup_to_bucket(object)
    unless @backup_to_bucket.nil?
      backup_key = "#{@backup_add_prefix}#{object.key}"
      @backup_bucket.object(backup_key).copy_from(:copy_source => "#{object.bucket_name}/#{object.key}")
      if @delete
        object.delete()
      end
    end
  end

  public
  def backup_to_dir(filename)
    unless @backup_to_dir.nil?
      FileUtils.cp(filename, @backup_to_dir)
    end
  end

  public
  def process_files(queue)
    objects = list_new_files

    objects.each do |key|
      if stop?
        break
      else
        @logger.debug("S3 input processing", :bucket => @bucket, :key => key)
        process_log(queue, key)
      end
    end
  end # def process_files

  public
  def stop
    # @current_thread is initialized in the `#run` method.
    # This variable is needed because `#stop` is called in another thread
    # than `#run`, requiring us to call stop! with an explicit thread.
    Stud.stop!(@current_thread)
  end

  private

  # Read the content of the local file
  #
  # @param [Queue] Where to push the event
  # @param [String] Which file to read from
  # @param [S3Object] Source s3 object
  # @return [Boolean] True if the file was completely read, false otherwise.
  def process_local_log(queue, filename, object)
    @logger.debug('Processing file', :filename => filename)
    metadata = {}
    # Currently codecs operate on bytes instead of streams,
    # so all IO work (decompression, reading) needs to be done in the actual
    # input and sent as bytes to the codecs.
    read_file(filename) do |line|
      if stop?
        @logger.warn("Logstash S3 input, stop reading in the middle of the file, we will read it again when logstash is started")
        return false
      end

      @codec.decode(line) do |event|
        # We are making an assumption concerning the cloudfront
        # log format: the user will use the plain or the line codec
        # and the message key will represent the actual line content.
        # If the event is only metadata, the event will be dropped.
        # This was the behavior of the pre 1.5 plugin.
        #
        # The lines need to go through the codecs to replace
        # unknown bytes in the log stream before doing a regexp match or
        # you will get a `Error: invalid byte sequence in UTF-8'
        if event_is_metadata?(event)
          @logger.debug('Event is metadata, updating the current cloudfront metadata', :event => event)
          update_metadata(metadata, event)
        else
          decorate(event)

          event.set("cloudfront_version", metadata[:cloudfront_version]) unless metadata[:cloudfront_version].nil?
          event.set("cloudfront_fields", metadata[:cloudfront_fields]) unless metadata[:cloudfront_fields].nil?

          if @include_object_properties
            event.set("[@metadata][s3]", object.data.to_h)
          else
            event.set("[@metadata][s3]", {})
          end

          event.set("[@metadata][s3][key]", object.key)

          queue << event
        end
      end
    end
    # Ensure any stateful codecs (such as multi-line) are flushed to the queue
    @codec.flush do |event|
      queue << event
    end

    return true
  end # def process_local_log

  private
  def event_is_metadata?(event)
    return false unless event.get("message").class == String
    line = event.get("message")
    version_metadata?(line) || fields_metadata?(line)
  end

  private
  def version_metadata?(line)
    line.start_with?('#Version: ')
  end

  private
  def fields_metadata?(line)
    line.start_with?('#Fields: ')
  end

  private
  def update_metadata(metadata, event)
    line = event.get('message').strip

    if version_metadata?(line)
      metadata[:cloudfront_version] = line.split(/#Version: (.+)/).last
    end

    if fields_metadata?(line)
      metadata[:cloudfront_fields] = line.split(/#Fields: (.+)/).last
    end
  end

  private
  def read_file(filename, &block)
    @force_gzip_decompress == true ? read_gzip_file(filename, block) : read_plain_file(filename, block)
  rescue => e
    # skip any broken file
    @logger.error("Failed to read the file. Skip processing.", :filename => filename, :exception => e.message)
  end

  def read_plain_file(filename, block)
    File.open(filename, 'rb') do |file|
      file.each(&block)
    end
  end

  private
  # Decompress via Java's streaming classes so the object is read line by
  # line regardless of its file extension.
  def read_gzip_file(filename, block)
    file_stream = FileInputStream.new(filename)
    gzip_stream = GZIPInputStream.new(file_stream)
    decoder = InputStreamReader.new(gzip_stream, "UTF-8")
    buffered = BufferedReader.new(decoder)

    while (line = buffered.readLine())
      block.call(line)
    end
  ensure
    buffered.close unless buffered.nil?
    decoder.close unless decoder.nil?
    gzip_stream.close unless gzip_stream.nil?
    file_stream.close unless file_stream.nil?
  end

  private
  def sincedb
    @sincedb ||= if @sincedb_path.nil?
                   @logger.info("Using default generated file for the sincedb", :filename => sincedb_file)
                   SinceDB::File.new(sincedb_file)
                 else
                   @logger.info("Using the provided sincedb_path",
                                :sincedb_path => @sincedb_path)
                   SinceDB::File.new(@sincedb_path)
                 end
  end

  private
  def sincedb_file
    digest = Digest::MD5.hexdigest("#{@bucket}+#{@prefix}")
    dir = File.join(LogStash::SETTINGS.get_value("path.data"), "plugins", "inputs", "s3")
    FileUtils::mkdir_p(dir)
    path = File.join(dir, "sincedb_#{digest}")

    # Migrate old default sincedb path to new one.
    if ENV["HOME"]
      # This is the old file path including the old digest mechanism.
      # It remains as a way to automatically upgrade users with the old default ($HOME)
      # to the new default (path.data)
      old = File.join(ENV["HOME"], ".sincedb_" + Digest::MD5.hexdigest("#{@bucket}+#{@prefix}"))
      if File.exist?(old)
        logger.info("Migrating old sincedb in $HOME to {path.data}")
        FileUtils.mv(old, path)
      end
    end

    path
  end

  def symbolized_settings
    @symbolized_settings ||= symbolize(@additional_settings)
  end

  def symbolize(hash)
    return hash unless hash.is_a?(Hash)
    symbolized = {}
    hash.each { |key, value| symbolized[key.to_sym] = symbolize(value) }
    symbolized
  end

  private
  def old_sincedb_file
  end

  private
  def ignore_filename?(filename)
    if @prefix == filename
      return true
    elsif filename.end_with?("/")
      return true
    elsif (@backup_add_prefix && @backup_to_bucket == @bucket && filename =~ /^#{backup_add_prefix}/)
      return true
    elsif @exclude_pattern.nil?
      return false
    elsif filename =~ Regexp.new(@exclude_pattern)
      return true
    else
      return false
    end
  end

  private
  def process_log(queue, key)
    object = @s3bucket.object(key)

    filename = File.join(temporary_directory, File.basename(key))
    if download_remote_file(object, filename)
      if process_local_log(queue, filename, object)
        lastmod = object.last_modified
        backup_to_bucket(object)
        backup_to_dir(filename)
        delete_file_from_bucket(object)
        FileUtils.remove_entry_secure(filename, true)
        sincedb.write(lastmod)
      end
    else
      FileUtils.remove_entry_secure(filename, true)
    end
  end

  private
  # Stream the remote file to the local disk
  #
  # @param [S3Object] Reference to the remote S3 object to download
  # @param [String] The temporary filename to stream to.
  # @return [Boolean] True if the file was completely downloaded
  def download_remote_file(remote_object, local_filename)
    completed = false
    @logger.debug("S3 input: Download remote file", :remote_key => remote_object.key, :local_filename => local_filename)
    File.open(local_filename, 'wb') do |s3file|
      return completed if stop?
      begin
        remote_object.get(:response_target => s3file)
        completed = true
      rescue Aws::Errors::ServiceError => e
        @logger.warn("S3 input: Unable to download remote file", :remote_key => remote_object.key, :message => e.message)
      end
    end
    completed
  end

  private
  def delete_file_from_bucket(object)
    if @delete and @backup_to_bucket.nil?
      object.delete()
    end
  end

  private
  def get_s3object
    options = symbolized_settings.merge(aws_options_hash || {})
    s3 = Aws::S3::Resource.new(options)
  end

  private
  module SinceDB
    class File
      def initialize(file)
        @sincedb_path = file
      end

      def newer?(date)
        date > read
      end

      def read
        if ::File.exists?(@sincedb_path)
          content = ::File.read(@sincedb_path).chomp.strip
          # If the file was created but we didn't have the time to write to it
          return content.empty? ? Time.new(0) : Time.parse(content)
        else
          return Time.new(0)
        end
      end

      def write(since = nil)
        since = Time.now() if since.nil?
        ::File.open(@sincedb_path, 'w') { |file| file.write(since.to_s) }
      end
    end
  end
end # class LogStash::Inputs::PacketloopS3
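The `SinceDB::File` helper above persists the `last_modified` time of the last processed object as a plain timestamp string, and `newer?` is what `list_new_files` uses to skip already-processed objects. A quick illustration of its contract (hypothetical usage; the path is invented):

```ruby
# Hypothetical usage of the SinceDB helper defined above.
db = LogStash::Inputs::PacketloopS3::SinceDB::File.new("/tmp/sincedb_example")

db.read                   # => Time.new(0) when no state has been written yet
db.write(Time.now)        # persist the mtime of the last processed object
db.newer?(Time.now + 60)  # => true: object modified after the last run
db.newer?(Time.now - 60)  # => false: not newer, so it will be skipped
```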
data/logstash-input-packetloop_s3.gemspec
ADDED
@@ -0,0 +1,25 @@
Gem::Specification.new do |s|
  s.name = 'logstash-input-packetloop_s3'
  s.version = '0.1.0'
  s.licenses = ['Apache-2.0']
  s.summary = 'A fork of Logstash S3 input that contains a temporary fix for processing Cloudwatch logs from an S3 bucket.'
  s.description = 'A fork of Logstash S3 input that contains a temporary fix for processing Cloudwatch logs from an S3 bucket until the upstream change in https://github.com/logstash-plugins/logstash-input-s3/issues/165 is merged'
  s.homepage = 'https://github.com/packetloop/logstash-input-packetloop_s3'
  s.authors = ['Mayhem']
  s.email = 'mayhem@arbor.net'
  s.require_paths = ['lib']

  # Files
  s.files = Dir['lib/**/*','spec/**/*','vendor/**/*','*.gemspec','*.md','CONTRIBUTORS','Gemfile','LICENSE','NOTICE.TXT']
  # Tests
  s.test_files = s.files.grep(%r{^(test|spec|features)/})

  # Special flag to let us know this is actually a logstash plugin
  s.metadata = { "logstash_plugin" => "true", "logstash_group" => "input" }

  # Gem dependencies
  s.add_runtime_dependency "logstash-core-plugin-api", "~> 2.0"
  s.add_runtime_dependency 'logstash-codec-plain'
  s.add_runtime_dependency 'stud', '>= 0.0.22'
  s.add_development_dependency 'logstash-devutils', '>= 0.0.16'
end
data/spec/inputs/packetloop_s3_spec.rb
ADDED
@@ -0,0 +1,11 @@
# encoding: utf-8
require "logstash/devutils/rspec/spec_helper"
require "logstash/inputs/packetloop_s3"

describe LogStash::Inputs::PacketloopS3 do

  it_behaves_like "an interruptible input plugin" do
    let(:config) { { "interval" => 100 } }
  end

end
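The shipped spec only reuses the generic shared example. A sketch of what an additional functional spec could look like (hypothetical, not part of this gem; assumes the standard logstash-devutils spec helper):

```ruby
# Hypothetical follow-up spec: exercise plugin lookup and option validation
# without touching AWS.
require "logstash/devutils/rspec/spec_helper"
require "logstash/inputs/packetloop_s3"

describe LogStash::Inputs::PacketloopS3 do
  let(:config) do
    {
      "bucket" => "bucket",
      "region" => "us-east-1",
      "force_gzip_decompress" => true,
      "interval" => 30
    }
  end

  it "is found under the packetloop_s3 name" do
    expect(LogStash::Plugin.lookup("input", "packetloop_s3")).to eq(described_class)
  end

  it "accepts the force_gzip_decompress option" do
    expect { described_class.new(config) }.not_to raise_error
  end
end
```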
metadata
ADDED
@@ -0,0 +1,114 @@
--- !ruby/object:Gem::Specification
name: logstash-input-packetloop_s3
version: !ruby/object:Gem::Version
  version: 0.1.0
platform: ruby
authors:
- Mayhem
autorequire:
bindir: bin
cert_chain: []
date: 2019-01-14 00:00:00.000000000 Z
dependencies:
- !ruby/object:Gem::Dependency
  name: logstash-core-plugin-api
  requirement: !ruby/object:Gem::Requirement
    requirements:
    - - "~>"
      - !ruby/object:Gem::Version
        version: '2.0'
  type: :runtime
  prerelease: false
  version_requirements: !ruby/object:Gem::Requirement
    requirements:
    - - "~>"
      - !ruby/object:Gem::Version
        version: '2.0'
- !ruby/object:Gem::Dependency
  name: logstash-codec-plain
  requirement: !ruby/object:Gem::Requirement
    requirements:
    - - ">="
      - !ruby/object:Gem::Version
        version: '0'
  type: :runtime
  prerelease: false
  version_requirements: !ruby/object:Gem::Requirement
    requirements:
    - - ">="
      - !ruby/object:Gem::Version
        version: '0'
- !ruby/object:Gem::Dependency
  name: stud
  requirement: !ruby/object:Gem::Requirement
    requirements:
    - - ">="
      - !ruby/object:Gem::Version
        version: 0.0.22
  type: :runtime
  prerelease: false
  version_requirements: !ruby/object:Gem::Requirement
    requirements:
    - - ">="
      - !ruby/object:Gem::Version
        version: 0.0.22
- !ruby/object:Gem::Dependency
  name: logstash-devutils
  requirement: !ruby/object:Gem::Requirement
    requirements:
    - - ">="
      - !ruby/object:Gem::Version
        version: 0.0.16
  type: :development
  prerelease: false
  version_requirements: !ruby/object:Gem::Requirement
    requirements:
    - - ">="
      - !ruby/object:Gem::Version
        version: 0.0.16
description: A fork of Logstash S3 input that contains a temporary fix for processing
  Cloudwatch logs from an S3 bucket until the upstream change in
  https://github.com/logstash-plugins/logstash-input-s3/issues/165 is merged
email: mayhem@arbor.net
executables: []
extensions: []
extra_rdoc_files: []
files:
- CHANGELOG.md
- CONTRIBUTORS
- DEVELOPER.md
- Gemfile
- LICENSE
- README.md
- lib/logstash/inputs/packetloop_s3.rb
- logstash-input-packetloop_s3.gemspec
- spec/inputs/packetloop_s3_spec.rb
homepage: https://github.com/packetloop/logstash-input-packetloop_s3
licenses:
- Apache-2.0
metadata:
  logstash_plugin: 'true'
  logstash_group: input
post_install_message:
rdoc_options: []
require_paths:
- lib
required_ruby_version: !ruby/object:Gem::Requirement
  requirements:
  - - ">="
    - !ruby/object:Gem::Version
      version: '0'
required_rubygems_version: !ruby/object:Gem::Requirement
  requirements:
  - - ">="
    - !ruby/object:Gem::Version
      version: '0'
requirements: []
rubyforge_project:
rubygems_version: 2.6.11
signing_key:
specification_version: 4
summary: A fork of Logstash S3 input that contains a temporary fix for processing
  Cloudwatch logs from an S3 bucket.
test_files:
- spec/inputs/packetloop_s3_spec.rb