logstash-input-crowdstrike_fdr 2.1.2

checksums.yaml ADDED
@@ -0,0 +1,7 @@
+ ---
+ SHA1:
+   metadata.gz: c8e0b47a277210807baaf165fa9a0b1acd18fa52
+   data.tar.gz: 553647a85d8c20fd05c74bf530fee9df2b7d053f
+ SHA512:
+   metadata.gz: 726387033c60271642d500faa8ec4cbf0d039a71413036615bc09bcb64582966b4bdddee342df2c6c8703822eacba5751ed10028c48fada1eacfb2337de142ab
+   data.tar.gz: e47d188d9294cfb83c2c28306cd63cddbf03351a8c918826701ada77377271dfb1729f6468ae1db4978e7ef72a185a8571d680b4871126e02c2bc74482074cec
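These are the digests RubyGems records for the two archives inside the `.gem` package. A minimal verification sketch, assuming a shell with the `gem`, `tar` and `sha1sum`/`sha512sum` tools available:

```sh
# fetch the published gem and extract the two archives the digests cover
gem fetch logstash-input-crowdstrike_fdr -v 2.1.2
tar -xf logstash-input-crowdstrike_fdr-2.1.2.gem metadata.gz data.tar.gz
sha1sum metadata.gz data.tar.gz     # compare against the SHA1 block above
sha512sum metadata.gz data.tar.gz   # compare against the SHA512 block above
```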
data/CHANGELOG.md ADDED
@@ -0,0 +1,141 @@
+ ## 2.1.2
+ - FEATURE: It is now possible to use both queue URLs and queue names.
+ - FEATURE: Add SQS long-polling config parameter: sqs_wait_time_seconds
+ - FIX: Valid UTF-8 byte sequences in logs are munged
+ - CLEANUP: Remove tests (as a starting point for clean testing)
+ ## 2.1.1
+ - FEATURE: Enable multi-region support for the included S3 client.
+ - Add region-by-bucket feature
+ ## 2.1.0
+ - FEATURE: Add S3 metadata -> config :include_object_properties
+ - FEATURE: Watch for threads in exception state and restart them
+ ## 2.0.9
+ - FIX: gzip detection should return false for files smaller than gzip_signiture_bytes
+ ## 2.0.8
+ - FIX: nil class error
+ ## 2.0.7
+ - FIX: gem error
+ ## 2.0.6
+ - FIX: crash of extender
+ ## 2.0.5
+ - FIX: crash on 0-byte file
+ - FIX: type-by-folder function
+ ## 2.0.4
+ - FIX: type-by-folder repair
+ - FIX: crash on 0-byte file
+ ## 2.0.3
+ - Increase max parsing time -> drop the event if it is reached
+ - The watcher thread should raise an error in the poller loop if the timeout is reached.
+ - Remove some debug logs
+
+ ## 2.0.2
+ FIX:
+ - Terminate every input line with \n (BufferedReader does not)
+ - Wrong input for type folder leads to empty types
+ ## 2.0.1
+ FIX:
+ - Deadlock while message decoding
+ - Make method stop? public
+
+ ## 2.0.0
+ Breaking changes:
+ - s3_key_prefix was never functional and will be removed. It is actually only used for metadata.folder backward compatibility.
+   Config values for S3 paths are regexes (if not an exact match).
+ - s3_options_by_bucket substitutes all s3_* options.
+   We will merge deprecated options into the new structure for one release.
+ Changes:
+ - Refactor the plugin structure to be more modular
+ - Rework the threading design
+ - Introduce s3_options_by_bucket to configure settings (e.g. aws_options_hash or type)
+ ## 1.6.1
+ - Fix typo in gzip error logging
+ ## 1.6.0
+ - Add a test for tmp file deletion
+ - Revert type folder regex
+ ## 1.5.9
+ - Fix regex for type folder
+ ## 1.5.8
+ - Add some debug logging and a toggle for delete
+ ## 1.5.7
+ - Remove debug output
+ ## 1.5.6
+ - Bugfix
+ ## 1.5.5
+ - Memo to me: better testing :-) Fix msg -> message
+ ## 1.5.4
+ - Bugfix
+ ## 1.5.3
+ - Try to fix the requeue problem
+ ## 1.5.2
+ - Bugfix: set metadata bucket, key, folder
+ - Feature: possibility to fall back to the old threading model by unsetting consumer_threads
+ ## 1.5.1
+ - Bugfix: rescue all AWS::S3 errors
+ ## 1.5.0
+ - Feature: use our own magic-byte detector (small & fast)
+ ## 1.4.9
+ - Feature: detect file type by magic bytes
+ ## 1.4.8
+ - Bugfix: CF metadata events, bug #7
+ - Feature: use an AWS role for the S3 client connection.
+ ## 1.4.7
+ - Removed from rubygems.org
+ ## 1.4.6
+ - Bugfix: JRuby > 2: no return from block
+ - Bugfix: no exit on gzip error
+ ## 1.4.5
+ - Bugfix: undeclared variable in rescue
+ ## 1.4.4
+ - Feature: make set_codec_by_folder match as a regex,
+   e.g.: set_codec_by_folder => { ".*-ELB-logs" => "plain" }
+ ## 1.4.3
+ - Fix: skip_delete on S3::Errors::AccessDenied
+ - Feature: codec per S3 folder
+ - Feature: alpha phase: different credentials for S3 / default credentials for SQS
+ - Feature: find a file's folder.
+ ## 1.4.2
+ - Fix: the thread shutdown method should kill in case wakeup fails
+ ## 1.4.1
+ - Fix: last event in file not decorated
+ - Adjust metadata namings
+ - Event decoration is in a private method now.
+ ## 1.4.0
+ - File handling rewritten; thanks to logstash-input-s3 for the inspiration
+ - Improve performance of gzip decoding by 10x by using Java's Zlib
+ - Added multithreading via config; use consumer_threads in the config
+ ## 1.2.0
+ - Add codec suggestion by content type
+ - Enrich metadata
+ - Fix some bugs
+ ## 1.1.9
+ - Add config for S3 folder prefix, auto codec and auto type
+ ## 1.1.8
+ - Add config switch for delivery with or without SNS
+ ## 1.1.6
+ - Fix a nil exception in message parsing
+ ## 1.1.5
+ - Fix the log level for some debug messages
+ ## 1.1.4
+ - Add account ID to config
+ ## 1.1.2
+ - Fix a bug in the S3 key generation
+ - Enable shipping through an SNS topic (needs another toJSON)
+ ## 1.1.1
+ - Added the ability to remove objects from S3 after processing.
+ - Work around an issue with the Ruby autoload that causes "uninitialized constant `Aws::Client::Errors`" errors.
+
+ ## 1.1.0
+ - Logstash 5 compatibility
+
+ ## 1.0.3
+ - Added some metadata to the event (bucket and object name, as committed by joshuaspence)
+ - Also try to unzip files ending with ".gz" (ALB logs are zipped but not marked with a proper Content-Encoding)
+
+ ## 1.0.2
+ - Fix for broken UTF-8 (so we won't lose a whole S3 log file because of a single invalid line; Ruby's split will die on those)
+
+ ## 1.0.1
+ - Same (re-released because of a screwed-up rubygems.org release)
+
+ ## 1.0.0
+ - Initial release
data/CONTRIBUTORS ADDED
@@ -0,0 +1,14 @@
+ The following is a list of people who have contributed ideas, code, bug
+ reports, or in general have helped logstash along its way.
+
+ Contributors:
+ * cherweg (this fork + some bugfixes)
+ * holgerjenczewski1007 (thank you for the refactoring)
+ * joshuaspence (event metadata)
+ * Heiko-san (initial contributor)
+ * logstash-input-sqs plugin as code base
+
+ Note: If you've sent us patches, bug reports, or otherwise contributed to
+ Logstash, and you aren't on the list above and want to be, please let us know
+ and we'll make sure you're here. Contributions from folks like you are what make
+ open source awesome.
data/Gemfile ADDED
@@ -0,0 +1,11 @@
+ source 'https://rubygems.org'
+
+ gemspec
+
+ logstash_path = ENV["LOGSTASH_PATH"] || "../../logstash"
+ use_logstash_source = ENV["LOGSTASH_SOURCE"] && ENV["LOGSTASH_SOURCE"].to_s == "1"
+
+ if Dir.exist?(logstash_path) && use_logstash_source
+   gem 'logstash-core', :path => "#{logstash_path}/logstash-core"
+   gem 'logstash-core-plugin-api', :path => "#{logstash_path}/logstash-core-plugin-api"
+ end
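The conditional in this Gemfile lets you develop the plugin against a local Logstash source checkout instead of the released core gems. A usage sketch (the checkout path is an assumption for illustration):

```sh
# resolve logstash-core from a local Logstash source tree instead of rubygems.org
LOGSTASH_SOURCE=1 LOGSTASH_PATH=/path/to/logstash bundle install
```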
data/LICENSE ADDED
@@ -0,0 +1,13 @@
+ Copyright (c) 2012–2015 Elasticsearch <http://www.elastic.co>
+
+ Licensed under the Apache License, Version 2.0 (the "License");
+ you may not use this file except in compliance with the License.
+ You may obtain a copy of the License at
+
+     http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
data/NOTICE.TXT ADDED
@@ -0,0 +1,5 @@
+ Elasticsearch
+ Copyright 2012-2015 Elasticsearch
+
+ This product includes software developed by The Apache Software
+ Foundation (http://www.apache.org/).
data/README.md ADDED
@@ -0,0 +1,147 @@
+ # Logstash Plugin
+
+ This is a plugin for [Logstash](https://github.com/elastic/logstash).
+
+ It is fully free and fully open source. The license is Apache 2.0.
+
+ ## Documentation
+
+ Get logs from AWS S3 buckets, as announced by object-created events via SQS.
+
+ This plugin is based on the logstash-input-sqs plugin but doesn't log the SQS event itself.
+ Instead it assumes that the event is an S3 object-created event and will then download
+ and process the referenced file.
+
+ Some issues of logstash-input-sqs, like Logstash not shutting down properly, have been
+ fixed for this plugin.
+
+ In contrast to logstash-input-sqs this plugin uses the "Receive Message Wait Time"
+ configured for the SQS queue in question; a good value is something like 10 seconds
+ to ensure a reasonable shutdown time for Logstash.
+ Also use a "Default Visibility Timeout" that is high enough for log files to be downloaded
+ and processed (I think a good value is 5-10 minutes for most use cases). The plugin will
+ avoid removing the event from the queue if the associated log file couldn't be correctly
+ passed to the processing level of Logstash (e.g. the downloaded content size doesn't match the SQS event).
+
+ This plugin is meant for high-availability setups: in contrast to logstash-input-s3 you can safely
+ use multiple Logstash nodes, since the use of SQS ensures that each log file is processed
+ only once and no file gets lost on node failure or on downscaling in auto-scaling groups.
+ (Use a "Message Retention Period" >= 4 days for your SQS queue to ensure you can survive
+ a weekend of faulty log file processing.)
+ By default the plugin will not delete objects from S3 buckets, so make sure to have a reasonable "Lifecycle"
+ configured for your buckets, which should keep the files for at least "Message Retention Period" days.
+
+ A typical setup contains some S3 buckets holding ELB, CloudTrail or other log files.
+ These are configured to send object-created events to an SQS queue, which is configured
+ as the source queue for this plugin.
+ (The plugin supports gzipped content if it is marked with "content-encoding: gzip", as is the
+ case for CloudTrail logs.)
+
+ The Logstash node therefore needs SQS permissions plus permission to download objects
+ from the S3 buckets that send events to the queue.
+ (If Logstash nodes are running on EC2 you should use an IAM instance role to provide permissions.)
+ [source,json]
+ {
+   "Version": "2012-10-17",
+   "Statement": [
+     {
+       "Effect": "Allow",
+       "Action": [
+         "sqs:Get*",
+         "sqs:List*",
+         "sqs:ReceiveMessage",
+         "sqs:ChangeMessageVisibility*",
+         "sqs:DeleteMessage*"
+       ],
+       "Resource": [
+         "arn:aws:sqs:us-east-1:123456789012:my-elb-log-queue"
+       ]
+     },
+     {
+       "Effect": "Allow",
+       "Action": [
+         "s3:Get*",
+         "s3:List*",
+         "s3:DeleteObject"
+       ],
+       "Resource": [
+         "arn:aws:s3:::my-elb-logs",
+         "arn:aws:s3:::my-elb-logs/*"
+       ]
+     }
+   ]
+ }
+
+ ## Need Help?
+
+ Need help? Try #logstash on freenode IRC or the https://discuss.elastic.co/c/logstash discussion forum.
+
+ ## Developing
+
+ ### 1. Plugin Development and Testing
+
+ #### Code
+ - To get started, you'll need JRuby with the Bundler gem installed.
+
+ - Create a new plugin or clone an existing one from the GitHub [logstash-plugins](https://github.com/logstash-plugins) organization. We also provide [example plugins](https://github.com/logstash-plugins?query=example).
+
+ - Install dependencies
+ ```sh
+ bundle install
+ ```
+
+ #### Test
+
+ - Update your dependencies
+
+ ```sh
+ bundle install
+ ```
+
+ - Run tests
+
+ ```sh
+ bundle exec rspec
+ ```
+
+ ### 2. Running your unpublished Plugin in Logstash
+
+ #### 2.1 Run in a local Logstash clone
+
+ - Edit the Logstash `Gemfile` and add the local plugin path, for example:
+ ```ruby
+ gem "logstash-filter-awesome", :path => "/your/local/logstash-filter-awesome"
+ ```
+ - Install the plugin
+ ```sh
+ bin/plugin install --no-verify
+ ```
+ - Run Logstash with your plugin
+ ```sh
+ bin/logstash -e 'filter {awesome {}}'
+ ```
+ At this point any modifications to the plugin code will be applied to this local Logstash setup. After modifying the plugin, simply rerun Logstash.
+
+ #### 2.2 Run in an installed Logstash
+
+ You can use the same **2.1** method to run your plugin in an installed Logstash by editing its `Gemfile` and pointing the `:path` to your local plugin development directory, or you can build the gem and install it using:
+
+ - Build your plugin gem
+ ```sh
+ gem build logstash-filter-awesome.gemspec
+ ```
+ - Install the plugin from the Logstash home
+ ```sh
+ bin/plugin install /your/local/plugin/logstash-filter-awesome.gem
+ ```
+ - Start Logstash and proceed to test the plugin
+
+ ## Contributing
+
+ All contributions are welcome: ideas, patches, documentation, bug reports, complaints, and even something you drew up on a napkin.
+
+ Programming is not a required skill. Whatever you've seen about open source and maintainers or community members saying "send patches or die" - you will not see that here.
+
+ It is more important to the community that you are able to contribute.
+
+ For more information about contributing, see the [CONTRIBUTING](https://github.com/elastic/logstash/blob/master/CONTRIBUTING.md) file.
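The README above stops short of an end-to-end pipeline example. Below is a minimal, hypothetical configuration sketch assembled from the options declared in the plugin source further down; the queue name, region, role ARN, bucket pattern and folder/type values are placeholders, and the nested `s3_options_by_bucket` structure follows the shape consumed by the plugin's `register` method:

```
input {
  crowdstrike_fdr {
    queue => "my-fdr-queue"        # SQS queue name (or URL, per the 2.1.2 changelog)
    region => "us-east-1"          # standard aws_config mixin option
    visibility_timeout => 600      # seconds; leave room to download and process
    consumer_threads => 4
    s3_options_by_bucket => [
      {
        "bucket_name" => "my-fdr-bucket.*"   # bucket patterns are treated as regexes
        "credentials" => { "role" => "arn:aws:iam::123456789012:role/logstash-s3-read" }
        "folders" => [
          {
            "key" => ".*-ELB-logs"
            "codec" => "plain"
            "type" => "elb"
          }
        ]
      }
    ]
  }
}
```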
@@ -0,0 +1,37 @@
+ # CodecFactory:
+ # lazy-fetch codec plugins
+
+ class CodecFactory
+   def initialize(logger, options)
+     @logger = logger
+     @default_codec = options[:default_codec]
+     @codec_by_folder = options[:codec_by_folder]
+     @codecs = {
+       'default' => @default_codec
+     }
+   end
+
+   def get_codec(record)
+     codec = find_codec(record)
+     if @codecs[codec].nil?
+       @codecs[codec] = get_codec_plugin(codec)
+     end
+     @logger.debug("Switching to codec #{codec}") if codec != 'default'
+     return @codecs[codec].clone
+   end
+
+   private
+
+   def find_codec(record)
+     bucket, key, folder = record[:bucket], record[:key], record[:folder]
+     unless @codec_by_folder[bucket].nil?
+       @logger.debug("Looking up codec for folder #{folder}", :codec => @codec_by_folder[bucket][folder])
+       return @codec_by_folder[bucket][folder] unless @codec_by_folder[bucket][folder].nil?
+     end
+     return 'default'
+   end
+
+   def get_codec_plugin(name, options = {})
+     LogStash::Plugin.lookup('codec', name).new(options)
+   end
+ end
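To make the folder lookup above concrete, here is a sketch of the nested bucket => folder => codec hash that `CodecFactory` expects; the bucket and folder names are invented. In the plugin these hashes are built in `register` and wrapped by `hash_key_is_regex` (defined in the main plugin file below), so the keys act as regex patterns rather than exact strings:

```ruby
# Hypothetical shape of the :codec_by_folder option passed to CodecFactory.
# Outer keys match bucket names, inner keys match folder names; the "plain"
# example mirrors the set_codec_by_folder sample from the 1.4.4 changelog entry.
codec_by_folder = {
  "my-elb-logs"        => { ".*-ELB-logs" => "plain" },
  "my-cloudtrail-logs" => { "AWSLogs"     => "json"  }
}
```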
@@ -0,0 +1,343 @@
+ # encoding: utf-8
+ require "logstash/inputs/threadable"
+ require "logstash/namespace"
+ require "logstash/timestamp"
+ require "logstash/plugin_mixins/aws_config"
+ require "logstash/shutdown_watcher"
+ require "logstash/errors"
+ require 'logstash/inputs/s3sqs/patch'
+ require "aws-sdk"
+
+ # "object-oriented interfaces on top of API clients"...
+ # => Overhead. FIXME: needed?
+ #require "aws-sdk-resources"
+ require "fileutils"
+ require "concurrent"
+ require 'tmpdir'
+ # unused in code:
+ #require "stud/interval"
+ #require "digest/md5"
+
+ require 'java'
+ java_import java.io.InputStream
+ java_import java.io.InputStreamReader
+ java_import java.io.FileInputStream
+ java_import java.io.BufferedReader
+ java_import java.util.zip.GZIPInputStream
+ java_import java.util.zip.ZipException
+ import java.lang.StringBuilder
+
+ # our helper classes
+ # these may go into this file for brevity...
+ require_relative 'sqs/poller'
+ require_relative 's3/client_factory'
+ require_relative 's3/downloader'
+ require_relative 'codec_factory'
+ require_relative 's3snssqs/log_processor'
+
+ Aws.eager_autoload!
+
+ # Get logs from AWS S3 buckets, as announced by object-created events via SQS.
+ #
+ # This plugin is based on the logstash-input-sqs plugin but doesn't log the SQS event itself.
+ # Instead it assumes that the event is an S3 object-created event and will then download
+ # and process the referenced file.
+ #
+ # Some issues of logstash-input-sqs, like Logstash not shutting down properly, have been
+ # fixed for this plugin.
+ #
+ # In contrast to logstash-input-sqs this plugin uses the "Receive Message Wait Time"
+ # configured for the SQS queue in question; a good value is something like 10 seconds
+ # to ensure a reasonable shutdown time for Logstash.
+ # Also use a "Default Visibility Timeout" that is high enough for log files to be downloaded
+ # and processed (I think a good value is 5-10 minutes for most use cases). The plugin will
+ # avoid removing the event from the queue if the associated log file couldn't be correctly
+ # passed to the processing level of Logstash (e.g. the downloaded content size doesn't match the SQS event).
+ #
+ # This plugin is meant for high-availability setups: in contrast to logstash-input-s3 you can safely
+ # use multiple Logstash nodes, since the use of SQS ensures that each log file is processed
+ # only once and no file gets lost on node failure or on downscaling in auto-scaling groups.
+ # (Use a "Message Retention Period" >= 4 days for your SQS queue to ensure you can survive
+ # a weekend of faulty log file processing.)
+ # By default the plugin will not delete objects from S3 buckets, so make sure to have a reasonable "Lifecycle"
+ # configured for your buckets, which should keep the files for at least "Message Retention Period" days.
+ #
+ # A typical setup contains some S3 buckets holding ELB, CloudTrail or other log files.
+ # These are configured to send object-created events to an SQS queue, which is configured
+ # as the source queue for this plugin.
+ # (The plugin supports gzipped content if it is marked with "content-encoding: gzip", as is the
+ # case for CloudTrail logs.)
+ #
+ # The Logstash node therefore needs SQS permissions plus permission to download objects
+ # from the S3 buckets that send events to the queue.
+ # (If Logstash nodes are running on EC2 you should use an IAM instance role to provide permissions.)
+ # [source,json]
+ # {
+ #   "Version": "2012-10-17",
+ #   "Statement": [
+ #     {
+ #       "Effect": "Allow",
+ #       "Action": [
+ #         "sqs:Get*",
+ #         "sqs:List*",
+ #         "sqs:ReceiveMessage",
+ #         "sqs:ChangeMessageVisibility*",
+ #         "sqs:DeleteMessage*"
+ #       ],
+ #       "Resource": [
+ #         "arn:aws:sqs:us-east-1:123456789012:my-elb-log-queue"
+ #       ]
+ #     },
+ #     {
+ #       "Effect": "Allow",
+ #       "Action": [
+ #         "s3:Get*",
+ #         "s3:List*",
+ #         "s3:DeleteObject"
+ #       ],
+ #       "Resource": [
+ #         "arn:aws:s3:::my-elb-logs",
+ #         "arn:aws:s3:::my-elb-logs/*"
+ #       ]
+ #     }
+ #   ]
+ # }
+ #
+ class LogStash::Inputs::CrowdStrikeFDR < LogStash::Inputs::Threadable
+   include LogStash::PluginMixins::AwsConfig::V2
+   include LogProcessor
+
+   config_name "crowdstrike_fdr" # needs to match the name of this file
+
+   default :codec, "json" # FDR uses JSON; the base project defaulted to "plain"
+
+   config :s3_key_prefix, :validate => :string, :default => '', :deprecated => true #, :obsolete => " Will be moved to s3_options_by_bucket/types"
+
+   config :s3_access_key_id, :validate => :string, :deprecated => true #, :obsolete => "Please migrate to :s3_options_by_bucket. We will remove this option in the next Version"
+   config :s3_secret_access_key, :validate => :string, :deprecated => true #, :obsolete => "Please migrate to :s3_options_by_bucket. We will remove this option in the next Version"
+   config :s3_role_arn, :validate => :string, :deprecated => true #, :obsolete => "Please migrate to :s3_options_by_bucket. We will remove this option in the next Version"
+
+   config :set_codec_by_folder, :validate => :hash, :default => {}, :deprecated => true #, :obsolete => "Please migrate to :s3_options_by_bucket. We will remove this option in the next Version"
+
+   # Default options for the S3 clients
+   config :s3_default_options, :validate => :hash, :required => false, :default => {}
+   # We need a list of buckets, together with role ARNs and possible folder/codecs:
+   config :s3_options_by_bucket, :validate => :array, :required => false # TODO: true
+   # Session name to use when assuming an IAM role
+   config :s3_role_session_name, :validate => :string, :default => "logstash"
+   config :delete_on_success, :validate => :boolean, :default => false
+   # Whether or not to include the S3 object's properties (last_modified, content_type, metadata)
+   # in each event at [@metadata][s3]. Regardless of this setting, [@metadata][s3][key] will always
+   # be present.
+   config :include_object_properties, :validate => :array, :default => [:last_modified, :content_type, :metadata]
+
+   ### sqs
+   # Name or URL of the SQS queue to pull messages from (not the ARN; URLs are supported as of 2.1.2).
+   config :queue, :validate => :string, :required => true
+   config :queue_owner_aws_account_id, :validate => :string, :required => false
+
+   # CrowdStrike FDR does not use SNS, so set the default to match
+   config :from_sns, :validate => :boolean, :default => false
+   config :sqs_skip_delete, :validate => :boolean, :default => false
+   config :sqs_wait_time_seconds, :validate => :number, :required => false
+   config :sqs_delete_on_failure, :validate => :boolean, :default => true
+
+   config :visibility_timeout, :validate => :number, :default => 120
+   config :max_processing_time, :validate => :number, :default => 8000
+   ### system
+   config :temporary_directory, :validate => :string, :default => File.join(Dir.tmpdir, "logstash")
+   # To run in multiple threads use this
+   config :consumer_threads, :validate => :number, :default => 1
+
+
+   public
+
+   # --- BEGIN plugin interface ----------------------------------------#
+
+   # initialisation
+   def register
+     # prepare system
+     FileUtils.mkdir_p(@temporary_directory) unless Dir.exist?(@temporary_directory)
+     @id ||= "Unknown" # use input { id => "name" } to set the thread identifier
+     @credentials_by_bucket = hash_key_is_regex({})
+     @region_by_bucket = hash_key_is_regex({})
+     # create the bucket=>folder=>codec lookup from config options
+     @codec_by_folder = hash_key_is_regex({})
+     @type_by_folder = hash_key_is_regex({})
+
+     # use deprecated settings only if the new config is missing:
+     if @s3_options_by_bucket.nil?
+       # We don't know any bucket name, so we must rely on a "catch-all" regex
+       s3_options = {
+         'bucket_name' => '.*',
+         'folders' => @set_codec_by_folder.map { |key, codec|
+           { 'key' => key, 'codec' => codec }
+         }
+       }
+       if @s3_role_arn.nil?
+         # access key/secret key pair needed
+         unless @s3_access_key_id.nil? or @s3_secret_access_key.nil?
+           s3_options['credentials'] = {
+             'access_key_id' => @s3_access_key_id,
+             'secret_access_key' => @s3_secret_access_key
+           }
+         end
+       else
+         s3_options['credentials'] = {
+           'role' => @s3_role_arn
+         }
+       end
+       @s3_options_by_bucket = [s3_options]
+     end
+
+     @s3_options_by_bucket.each do |options|
+       bucket = options['bucket_name']
+       if options.key?('credentials')
+         @credentials_by_bucket[bucket] = options['credentials']
+       end
+       if options.key?('region')
+         @region_by_bucket[bucket] = options['region']
+       end
+       if options.key?('folders')
+         # make these hashes do key lookups using regex matching
+         folders = hash_key_is_regex({})
+         types = hash_key_is_regex({})
+         options['folders'].each do |entry|
+           @logger.debug("options for folder ", :folder => entry)
+           folders[entry['key']] = entry['codec'] if entry.key?('codec')
+           types[entry['key']] = entry['type'] if entry.key?('type')
+         end
+         @codec_by_folder[bucket] = folders unless folders.empty?
+         @type_by_folder[bucket] = types unless types.empty?
+       end
+     end
+
+     @received_stop = Concurrent::AtomicBoolean.new(false)
+
+     # instantiate helpers
+     @sqs_poller = SqsPoller.new(@logger, @received_stop,
+       {
+         visibility_timeout: @visibility_timeout,
+         skip_delete: @sqs_skip_delete,
+         wait_time_seconds: @sqs_wait_time_seconds
+       },
+       {
+         sqs_queue: @queue,
+         queue_owner_aws_account_id: @queue_owner_aws_account_id,
+         from_sns: @from_sns,
+         max_processing_time: @max_processing_time,
+         sqs_delete_on_failure: @sqs_delete_on_failure
+       },
+       aws_options_hash)
+     @s3_client_factory = S3ClientFactory.new(@logger, {
+       aws_region: @region,
+       s3_default_options: @s3_default_options,
+       s3_credentials_by_bucket: @credentials_by_bucket,
+       s3_region_by_bucket: @region_by_bucket,
+       s3_role_session_name: @s3_role_session_name
+     }, aws_options_hash)
+     @s3_downloader = S3Downloader.new(@logger, @received_stop, {
+       s3_client_factory: @s3_client_factory,
+       delete_on_success: @delete_on_success,
+       include_object_properties: @include_object_properties
+     })
+     @codec_factory = CodecFactory.new(@logger, {
+       default_codec: @codec,
+       codec_by_folder: @codec_by_folder
+     })
+     #@log_processor = LogProcessor.new(self)
+
+     # administrative stuff
+     @worker_threads = []
+   end
+
+   # startup
+   def run(logstash_event_queue)
+     @control_threads = @consumer_threads.times.map do |thread_id|
+       Thread.new do
+         restart_count = 0
+         while not stop?
+           # make thread starts asynchronous to prevent polling the same message from SQS
+           sleep 0.5
+           worker_thread = run_worker_thread(logstash_event_queue, thread_id)
+           worker_thread.join
+           restart_count += 1
+           thread_id = "#{thread_id}_#{restart_count}"
+           @logger.info("[control_thread] restarting a thread #{thread_id}... ", :thread => worker_thread.inspect)
+         end
+       end
+     end
+     @control_threads.each { |t| t.join }
+   end
+
+   # shutdown
+   def stop
+     @received_stop.make_true
+
+     unless @worker_threads.nil?
+       @worker_threads.each do |worker|
+         begin
+           @logger.info("Stopping thread ... ", :thread => worker.inspect)
+           worker.wakeup
+         rescue
+           @logger.error("Cannot stop thread ... trying to kill it", :thread => worker.inspect)
+           worker.kill
+         end
+       end
+     end
+   end
+
+   def stop?
+     @received_stop.value
+   end
+
+   # --- END plugin interface ------------------------------------------#
+
+   private
+   def run_worker_thread(queue, thread_id)
+     Thread.new do
+       LogStash::Util.set_thread_name("Worker #{@id}/#{thread_id}")
+       @logger.info("[#{Thread.current[:name]}] started (#{Time.now})") #PROFILING
+       temporary_directory = Dir.mktmpdir("#{@temporary_directory}/")
+       @sqs_poller.run do |record|
+         throw :skip_delete if stop?
+         # record is a valid object with the keys ":bucket", ":key", ":size"
+         record[:local_file] = File.join(temporary_directory, File.basename(record[:key]))
+         if @s3_downloader.copy_s3object_to_disk(record)
+           completed = catch(:skip_delete) do
+             process(record, queue)
+           end
+           @s3_downloader.cleanup_local_object(record)
+           # re-throw if necessary:
+           throw :skip_delete unless completed
+           @s3_downloader.cleanup_s3object(record)
+         end
+       end
+     end
+   end
+
+   # Will be removed in further releases:
+   def get_object_folder(key)
+     if match = /#{s3_key_prefix}\/?(?<type_folder>.*?)\/.*/.match(key)
+       return match['type_folder']
+     else
+       return ""
+     end
+   end
+
+   def hash_key_is_regex(myhash)
+     myhash.default_proc = lambda do |hash, lookup|
+       result = nil
+       hash.each_pair do |key, value|
+         if %r[#{key}] =~ lookup
+           result = value
+           break
+         end
+       end
+       result
+     end
+     # return input hash (convenience)
+     return myhash
+   end
+ end # class
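A note on `hash_key_is_regex`, since all of the per-bucket and per-folder lookups depend on it: a Hash's `default_proc` only fires when no exact key exists, and this one scans the pairs and returns the first value whose key, interpreted as a regex, matches the lookup string. A small self-contained sketch of the same mechanism (names invented):

```ruby
# Plain-Ruby demonstration of the regex-keyed lookup used by the plugin.
lookup = { ".*-ELB-logs" => "plain", "CloudTrail" => "json" }
lookup.default_proc = lambda do |hash, key|
  result = nil
  hash.each_pair do |pattern, value|
    if %r[#{pattern}] =~ key
      result = value
      break
    end
  end
  result
end

lookup["prod-ELB-logs"]   # => "plain"  (no exact key, but the regex matches)
lookup["CloudTrail"]      # => "json"   (exact key wins; default_proc not called)
lookup["unknown-folder"]  # => nil
```

Note that the result of a miss is not cached, so every non-exact lookup rescans the hash; with the handful of bucket and folder patterns a config typically declares, that cost is negligible.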