logstash-input-crowdstrike_fdr 2.1.2

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
checksums.yaml ADDED
@@ -0,0 +1,7 @@
+ ---
+ SHA1:
+   metadata.gz: c8e0b47a277210807baaf165fa9a0b1acd18fa52
+   data.tar.gz: 553647a85d8c20fd05c74bf530fee9df2b7d053f
+ SHA512:
+   metadata.gz: 726387033c60271642d500faa8ec4cbf0d039a71413036615bc09bcb64582966b4bdddee342df2c6c8703822eacba5751ed10028c48fada1eacfb2337de142ab
+   data.tar.gz: e47d188d9294cfb83c2c28306cd63cddbf03351a8c918826701ada77377271dfb1729f6468ae1db4978e7ef72a185a8571d680b4871126e02c2bc74482074cec
data/CHANGELOG.md ADDED
@@ -0,0 +1,141 @@
+ ## 2.1.2
+ - FEATURE: Queue URLs can now be used as well as queue names.
+ - FEATURE: Add SQS long-polling config parameter: sqs_wait_time_seconds
+ - FIX: Valid UTF-8 byte sequences in logs were munged
+ - CLEANUP: Remove tests (as a starting point for clean testing)
+ ## 2.1.1
+ - FEATURE: Enable multi-region support for the included S3 client.
+ - Add region-by-bucket feature
+ ## 2.1.0
+ - FEATURE: Add S3 metadata -> config :include_object_properties
+ - FEATURE: Watch for threads in an exception state and restart them
+ ## 2.0.9
+ - FIX: gzip detection should return false for files smaller than gzip_signiture_bytes
+ ## 2.0.8
+ - FIX: nil class error
+ ## 2.0.7
+ - FIX: gem error
+ ## 2.0.6
+ - FIX: crash of extender
+ ## 2.0.5
+ - FIX: crash on 0-byte file
+ - FIX: type-by-folder function
+ ## 2.0.4
+ - FIX: type-by-folder repair
+ - FIX: crash on 0-byte file
+ ## 2.0.3
+ - Increase max parsing time -> drop event if reached
+ - Watcher thread should raise an error in the poller loop if the timeout is reached.
+ - Remove some debug logs
+
+ ## 2.0.2
+ FIX:
+ - Terminate every input line with \n (BufferedReader does not)
+ - Wrong input for the type folder led to empty types
+ ## 2.0.1
+ FIX:
+ - Deadlock while message decoding
+ - Make method stop? public
+
+ ## 2.0.0
+ Breaking Changes:
+ - s3_key_prefix was never functional and will be removed. It is actually only used for metadata.folder backward compatibility.
+   Configs for s3 paths are regexes (if not an exact match).
+ - s3_options_by_bucket substitutes all s3_* options.
+   We will merge deprecated options into the new structure for one release.
+ Changes:
+ - Refactor plugin structure to be more modular
+ - Rework threading design
+ - Introduce s3_options_by_bucket to configure settings (e.g. aws_options_hash or type)
+ ## 1.6.1
+ - Fix typo in gzip error logging
+ ## 1.6.0
+ - Add a test for tmp file deletion
+ - Revert type folder regex
+ ## 1.5.9
+ - Fix regex for type folder
+ ## 1.5.8
+ - Add some debug output and a toggle for delete
+ ## 1.5.7
+ - Remove debug output
+ ## 1.5.6
+ - Bugfix
+ ## 1.5.5
+ - Memo to me: better testing :-) Fix msg -> message
+ ## 1.5.4
+ - Bugfix
+ ## 1.5.3
+ - Try to fix requeue problem
+ ## 1.5.2
+ - Bugfix: set metadata bucket, key, folder
+ - Feature: possibility to fall back to the old threading model by unsetting consumer_threads
+ ## 1.5.1
+ - Bugfix: rescue all AWS::S3 errors
+ ## 1.5.0
+ - Feature: use own magic-byte detector (small & fast)
+ ## 1.4.9
+ - Feature: detect file type by magic byte
+ ## 1.4.8
+ - Bugfix: CF metadata events, bug #7
+ - Feature: use an AWS role for the S3 client connection.
+ ## 1.4.7
+ - Removed from rubygems.org
+ ## 1.4.6
+ - Bugfix: JRuby > 2: no return from block
+ - Bugfix: no exit on gzip error
+ ## 1.4.5
+ - Bugfix: undeclared var in rescue
+ ## 1.4.4
+ - Feature: make set_codec_by_folder match as regex,
+   e.g.: set_codec_by_folder => { ".*-ELB-logs" => "plain" }
+ ## 1.4.3
+ - Fix: skip_delete on S3::Errors::AccessDenied
+ - Feature: codec per s3 folder
+ - Feature: alpha phase: different credentials for s3 / default credentials for sqs
+ - Feature: find files folder.
+ ## 1.4.2
+ - Fix: thread shutdown method should kill in case wakeup fails
+ ## 1.4.1
+ - Fix: last event in file not decorated
+ - Adjust metadata namings
+ - Event decoration in a private method now.
+ ## 1.4.0
+ - File handling rewritten, thanks to logstash-input-s3 for the inspiration
+ - Improve performance of gzip decoding by 10x by using Java's Zlib
+ - Added multithreading via config; use consumer_threads in the config
+ ## 1.2.0
+ - Add codec suggestion by content type
+ - Enrich metadata
+ - Fix some bugs
+ ## 1.1.9
+ - Add config for s3 folder prefix, auto codec and auto type
+ ## 1.1.8
+ - Add config switch for delivery with or without SNS
+ ## 1.1.6
+ - Fix a nil exception in message parsing
+ ## 1.1.5
+ - Fix log level for some debug messages
+ ## 1.1.4
+ - Add account ID to config
+ ## 1.1.2
+ - Fix a bug in the S3 key generation
+ - Enable shipping through an SNS topic (needs another toJSON)
+ ## 1.1.1
+ - Added the ability to remove objects from S3 after processing.
+ - Work around an issue with the Ruby autoload that causes "uninitialized constant `Aws::Client::Errors`" errors.
+
+ ## 1.1.0
+ - Logstash 5 compatibility
+
+ ## 1.0.3
+ - Added some metadata to the event (bucket and object name, as committed by joshuaspence)
+ - Also try to unzip files ending with ".gz" (ALB logs are zipped but not marked with a proper Content-Encoding)
+
+ ## 1.0.2
+ - Fix for broken UTF-8 (so we won't lose a whole s3 log file because of a single invalid line; Ruby's split will die on those)
+
+ ## 1.0.1
+ - Same as 1.0.0 (because of a screwed-up rubygems.org release)
+
+ ## 1.0.0
+ - Initial release
data/CONTRIBUTORS ADDED
@@ -0,0 +1,14 @@
+ The following is a list of people who have contributed ideas, code, bug
+ reports, or in general have helped logstash along its way.
+
+ Contributors:
+ * cherweg (this fork + some bugfixes)
+ * holgerjenczewski1007 (thank you for the refactoring)
+ * joshuaspence (event metadata)
+ * Heiko-san (initial contributor)
+ * logstash-input-sqs plugin as code base
+
+ Note: If you've sent us patches, bug reports, or otherwise contributed to
+ Logstash, and you aren't on the list above and want to be, please let us know
+ and we'll make sure you're here. Contributions from folks like you are what make
+ open source awesome.
data/Gemfile ADDED
@@ -0,0 +1,11 @@
+ source 'https://rubygems.org'
+
+ gemspec
+
+ logstash_path = ENV["LOGSTASH_PATH"] || "../../logstash"
+ use_logstash_source = ENV["LOGSTASH_SOURCE"] && ENV["LOGSTASH_SOURCE"].to_s == "1"
+
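+ # Resolve logstash-core from a local source checkout when requested, e.g.
+ # (illustrative paths): LOGSTASH_SOURCE=1 LOGSTASH_PATH=../../logstash bundle install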
+ if Dir.exist?(logstash_path) && use_logstash_source
+   gem 'logstash-core', :path => "#{logstash_path}/logstash-core"
+   gem 'logstash-core-plugin-api', :path => "#{logstash_path}/logstash-core-plugin-api"
+ end
data/LICENSE ADDED
@@ -0,0 +1,13 @@
+ Copyright (c) 2012-2015 Elasticsearch <http://www.elastic.co>
+
+ Licensed under the Apache License, Version 2.0 (the "License");
+ you may not use this file except in compliance with the License.
+ You may obtain a copy of the License at
+
+     http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
data/NOTICE.TXT ADDED
@@ -0,0 +1,5 @@
+ Elasticsearch
+ Copyright 2012-2015 Elasticsearch
+
+ This product includes software developed by The Apache Software
+ Foundation (http://www.apache.org/).
data/README.md ADDED
@@ -0,0 +1,147 @@
+ # Logstash Plugin
+
+ This is a plugin for [Logstash](https://github.com/elastic/logstash).
+
+ It is fully free and fully open source. The license is Apache 2.0.
+
+ ## Documentation
+
+ Get logs from AWS S3 buckets as issued by an object-created event via SQS.
+
+ This plugin is based on the logstash-input-sqs plugin but doesn't log the SQS event itself.
+ Instead it assumes that the event is an S3 object-created event and will then download
+ and process the given file.
+
+ Some issues of logstash-input-sqs, like Logstash not shutting down properly, have been
+ fixed for this plugin.
+
+ In contrast to logstash-input-sqs, this plugin uses the "Receive Message Wait Time"
+ configured for the SQS queue in question; a good value is something like 10 seconds,
+ to ensure a reasonable shutdown time for Logstash.
+ Also use a "Default Visibility Timeout" that is high enough for log files to be downloaded
+ and processed (5-10 minutes should be a good value for most use cases); the plugin will
+ avoid removing the event from the queue if the associated log file couldn't be correctly
+ passed to the processing level of Logstash (e.g. the downloaded content size doesn't match the SQS event).
+
+ This plugin is meant for high-availability setups. In contrast to logstash-input-s3 you can safely
+ use multiple Logstash nodes, since the use of SQS ensures that each log file is processed
+ only once and no file will get lost on node failure or downscaling in auto-scaling groups.
+ (You should use a "Message Retention Period" >= 4 days for your SQS queue to ensure you can survive
+ a weekend of faulty log file processing.)
+ The plugin will not delete objects from S3 buckets, so make sure to have a reasonable "Lifecycle"
+ configured for your buckets, which should keep the files at least "Message Retention Period" days.
+
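+ For example, a queue with these recommended settings could be created as follows
+ (the queue name is illustrative; 600 seconds = 10 minutes, 345600 seconds = 4 days):
+
+ ```sh
+ aws sqs create-queue --queue-name my-elb-log-queue \
+   --attributes ReceiveMessageWaitTimeSeconds=10,VisibilityTimeout=600,MessageRetentionPeriod=345600
+ ```
+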
+ A typical setup will contain some S3 buckets containing ELB, CloudTrail or other log files.
+ These will be configured to send object-created events to an SQS queue, which will be configured
+ as the source queue for this plugin.
+ (The plugin supports gzipped content if it is marked with "Content-Encoding: gzip", as is the
+ case for CloudTrail logs.)
+
+ The Logstash node therefore needs SQS permissions plus the permissions to download objects
+ from the S3 buckets that send events to the queue.
+ (If Logstash nodes are running on EC2 you should use an IAM instance role to provide the permissions.)
+ [source,json]
+ {
+   "Version": "2012-10-17",
+   "Statement": [
+     {
+       "Effect": "Allow",
+       "Action": [
+         "sqs:Get*",
+         "sqs:List*",
+         "sqs:ReceiveMessage",
+         "sqs:ChangeMessageVisibility*",
+         "sqs:DeleteMessage*"
+       ],
+       "Resource": [
+         "arn:aws:sqs:us-east-1:123456789012:my-elb-log-queue"
+       ]
+     },
+     {
+       "Effect": "Allow",
+       "Action": [
+         "s3:Get*",
+         "s3:List*",
+         "s3:DeleteObject"
+       ],
+       "Resource": [
+         "arn:aws:s3:::my-elb-logs",
+         "arn:aws:s3:::my-elb-logs/*"
+       ]
+     }
+   ]
+ }
+
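+ With the queue and permissions in place, a minimal pipeline configuration could look
+ like this (queue name and values are illustrative):
+
+ ```
+ input {
+   crowdstrike_fdr {
+     queue              => "my-elb-log-queue"
+     visibility_timeout => 600
+     consumer_threads   => 4
+   }
+ }
+ ```
+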
+ ## Need Help?
+
+ Need help? Try #logstash on freenode IRC or the https://discuss.elastic.co/c/logstash discussion forum.
+
+ ## Developing
+
+ ### 1. Plugin Development and Testing
+
+ #### Code
+ - To get started, you'll need JRuby with the Bundler gem installed.
+
+ - Create a new plugin or clone an existing one from the GitHub [logstash-plugins](https://github.com/logstash-plugins) organization. We also provide [example plugins](https://github.com/logstash-plugins?query=example).
+
+ - Install dependencies
+ ```sh
+ bundle install
+ ```
+
+ #### Test
+
+ - Update your dependencies
+
+ ```sh
+ bundle install
+ ```
+
+ - Run tests
+
+ ```sh
+ bundle exec rspec
+ ```
+
+ ### 2. Running your unpublished Plugin in Logstash
+
+ #### 2.1 Run in a local Logstash clone
+
+ - Edit Logstash `Gemfile` and add the local plugin path, for example:
+ ```ruby
+ gem "logstash-filter-awesome", :path => "/your/local/logstash-filter-awesome"
+ ```
+ - Install plugin
+ ```sh
+ bin/plugin install --no-verify
+ ```
+ - Run Logstash with your plugin
+ ```sh
+ bin/logstash -e 'filter {awesome {}}'
+ ```
+ At this point any modifications to the plugin code will be applied to this local Logstash setup. After modifying the plugin, simply rerun Logstash.
+
+ #### 2.2 Run in an installed Logstash
+
+ You can use the same **2.1** method to run your plugin in an installed Logstash by editing its `Gemfile` and pointing the `:path` to your local plugin development directory, or you can build the gem and install it using:
+
+ - Build your plugin gem
+ ```sh
+ gem build logstash-filter-awesome.gemspec
+ ```
+ - Install the plugin from the Logstash home
+ ```sh
+ bin/plugin install /your/local/plugin/logstash-filter-awesome.gem
+ ```
+ - Start Logstash and proceed to test the plugin
+
+ ## Contributing
+
+ All contributions are welcome: ideas, patches, documentation, bug reports, complaints, and even something you drew up on a napkin.
+
+ Programming is not a required skill. Whatever you've seen about open source and maintainers or community members saying "send patches or die" - you will not see that here.
+
+ It is more important to the community that you are able to contribute.
+
+ For more information about contributing, see the [CONTRIBUTING](https://github.com/elastic/logstash/blob/master/CONTRIBUTING.md) file.
data/lib/logstash/inputs/codec_factory.rb ADDED
@@ -0,0 +1,37 @@
+ # CodecFactory:
+ # lazy-fetch codec plugins
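+ #
+ # Usage sketch (illustrative values):
+ #   factory = CodecFactory.new(logger, default_codec: codec_instance,
+ #                              codec_by_folder: { 'my-bucket' => { 'elb/' => 'plain' } })
+ #   codec = factory.get_codec(bucket: 'my-bucket', key: 'elb/2024/01/file.gz', folder: 'elb/')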
+
+ class CodecFactory
+   def initialize(logger, options)
+     @logger = logger
+     @default_codec = options[:default_codec]
+     @codec_by_folder = options[:codec_by_folder]
+     @codecs = {
+       'default' => @default_codec
+     }
+   end
+
+   def get_codec(record)
+     codec = find_codec(record)
+     if @codecs[codec].nil?
+       @codecs[codec] = get_codec_plugin(codec)
+     end
+     @logger.debug("Switching to codec #{codec}") if codec != 'default'
+     return @codecs[codec].clone
+   end
+
+   private
+
+   def find_codec(record)
+     bucket, key, folder = record[:bucket], record[:key], record[:folder]
+     unless @codec_by_folder[bucket].nil?
+       @logger.debug("Looking up codec for folder #{folder}", :codec => @codec_by_folder[bucket][folder])
+       return @codec_by_folder[bucket][folder] unless @codec_by_folder[bucket][folder].nil?
+     end
+     return 'default'
+   end
+
+   def get_codec_plugin(name, options = {})
+     LogStash::Plugin.lookup('codec', name).new(options)
+   end
+ end
data/lib/logstash/inputs/crowdstrike_fdr.rb ADDED
@@ -0,0 +1,343 @@
+ # encoding: utf-8
+ require "logstash/inputs/threadable"
+ require "logstash/namespace"
+ require "logstash/timestamp"
+ require "logstash/plugin_mixins/aws_config"
+ require "logstash/shutdown_watcher"
+ require "logstash/errors"
+ require 'logstash/inputs/s3sqs/patch'
+ require "aws-sdk"
+
+ # "object-oriented interfaces on top of API clients"...
+ # => Overhead. FIXME: needed?
+ #require "aws-sdk-resources"
+ require "fileutils"
+ require "concurrent"
+ require 'tmpdir'
+ # unused in code:
+ #require "stud/interval"
+ #require "digest/md5"
+
+ require 'java'
+ java_import java.io.InputStream
+ java_import java.io.InputStreamReader
+ java_import java.io.FileInputStream
+ java_import java.io.BufferedReader
+ java_import java.util.zip.GZIPInputStream
+ java_import java.util.zip.ZipException
+ java_import java.lang.StringBuilder
+
+ # our helper classes
+ # these may go into this file for brevity...
+ require_relative 'sqs/poller'
+ require_relative 's3/client_factory'
+ require_relative 's3/downloader'
+ require_relative 'codec_factory'
+ require_relative 's3snssqs/log_processor'
+
+ Aws.eager_autoload!
+
+ # Get logs from AWS S3 buckets as issued by an object-created event via SQS.
+ #
+ # This plugin is based on the logstash-input-sqs plugin but doesn't log the SQS event itself.
+ # Instead it assumes that the event is an S3 object-created event and will then download
+ # and process the given file.
+ #
+ # Some issues of logstash-input-sqs, like Logstash not shutting down properly, have been
+ # fixed for this plugin.
+ #
+ # In contrast to logstash-input-sqs, this plugin uses the "Receive Message Wait Time"
+ # configured for the SQS queue in question; a good value is something like 10 seconds,
+ # to ensure a reasonable shutdown time for Logstash.
+ # Also use a "Default Visibility Timeout" that is high enough for log files to be downloaded
+ # and processed (5-10 minutes should be a good value for most use cases); the plugin will
+ # avoid removing the event from the queue if the associated log file couldn't be correctly
+ # passed to the processing level of Logstash (e.g. the downloaded content size doesn't match the SQS event).
+ #
+ # This plugin is meant for high-availability setups. In contrast to logstash-input-s3 you can safely
+ # use multiple Logstash nodes, since the use of SQS ensures that each log file is processed
+ # only once and no file will get lost on node failure or downscaling in auto-scaling groups.
+ # (You should use a "Message Retention Period" >= 4 days for your SQS queue to ensure you can survive
+ # a weekend of faulty log file processing.)
+ # The plugin will not delete objects from S3 buckets, so make sure to have a reasonable "Lifecycle"
+ # configured for your buckets, which should keep the files at least "Message Retention Period" days.
+ #
+ # A typical setup will contain some S3 buckets containing ELB, CloudTrail or other log files.
+ # These will be configured to send object-created events to an SQS queue, which will be configured
+ # as the source queue for this plugin.
+ # (The plugin supports gzipped content if it is marked with "Content-Encoding: gzip", as is the
+ # case for CloudTrail logs.)
+ #
+ # The Logstash node therefore needs SQS permissions plus the permissions to download objects
+ # from the S3 buckets that send events to the queue.
+ # (If Logstash nodes are running on EC2 you should use an IAM instance role to provide the permissions.)
+ # [source,json]
+ #   {
+ #     "Version": "2012-10-17",
+ #     "Statement": [
+ #       {
+ #         "Effect": "Allow",
+ #         "Action": [
+ #           "sqs:Get*",
+ #           "sqs:List*",
+ #           "sqs:ReceiveMessage",
+ #           "sqs:ChangeMessageVisibility*",
+ #           "sqs:DeleteMessage*"
+ #         ],
+ #         "Resource": [
+ #           "arn:aws:sqs:us-east-1:123456789012:my-elb-log-queue"
+ #         ]
+ #       },
+ #       {
+ #         "Effect": "Allow",
+ #         "Action": [
+ #           "s3:Get*",
+ #           "s3:List*",
+ #           "s3:DeleteObject"
+ #         ],
+ #         "Resource": [
+ #           "arn:aws:s3:::my-elb-logs",
+ #           "arn:aws:s3:::my-elb-logs/*"
+ #         ]
+ #       }
+ #     ]
+ #   }
+ #
+ class LogStash::Inputs::CrowdStrikeFDR < LogStash::Inputs::Threadable
+   include LogStash::PluginMixins::AwsConfig::V2
+   include LogProcessor
+
+   config_name "crowdstrike_fdr" # needs to match the name of this file
+
+   default :codec, "json" # FDR uses JSON; the base project defaulted to "plain"
+
+   config :s3_key_prefix, :validate => :string, :default => '', :deprecated => true #, :obsolete => "Will be moved to s3_options_by_bucket/types"
+
+   config :s3_access_key_id, :validate => :string, :deprecated => true #, :obsolete => "Please migrate to :s3_options_by_bucket. We will remove this option in the next version."
+   config :s3_secret_access_key, :validate => :string, :deprecated => true #, :obsolete => "Please migrate to :s3_options_by_bucket. We will remove this option in the next version."
+   config :s3_role_arn, :validate => :string, :deprecated => true #, :obsolete => "Please migrate to :s3_options_by_bucket. We will remove this option in the next version."
+
+   config :set_codec_by_folder, :validate => :hash, :default => {}, :deprecated => true #, :obsolete => "Please migrate to :s3_options_by_bucket. We will remove this option in the next version."
+
+   # Default options for the S3 clients
+   config :s3_default_options, :validate => :hash, :required => false, :default => {}
+   # We need a list of buckets, together with role ARNs and possible folders/codecs:
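+   # An illustrative entry (bucket, region, ARN and folder values are placeholders):
+   #   s3_options_by_bucket => [
+   #     {
+   #       "bucket_name" => "my-fdr-bucket",
+   #       "region"      => "us-east-1",
+   #       "credentials" => { "role" => "arn:aws:iam::123456789012:role/logstash-read" },
+   #       "folders"     => [
+   #         { "key" => "data/", "codec" => "json", "type" => "fdr" }
+   #       ]
+   #     }
+   #   ]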
+   config :s3_options_by_bucket, :validate => :array, :required => false # TODO: true
+   # Session name to use when assuming an IAM role
+   config :s3_role_session_name, :validate => :string, :default => "logstash"
+   config :delete_on_success, :validate => :boolean, :default => false
+   # Whether or not to include the S3 object's properties (last_modified, content_type, metadata)
+   # in each event at [@metadata][s3]. Regardless of this setting, [@metadata][s3][key] will always
+   # be present.
+   config :include_object_properties, :validate => :array, :default => [:last_modified, :content_type, :metadata]
+
+   ### sqs
+   # Name or URL of the SQS queue to pull messages from (since 2.1.2 both queue names and queue URLs are supported; ARNs are not).
+   config :queue, :validate => :string, :required => true
+   config :queue_owner_aws_account_id, :validate => :string, :required => false
+
+   # CrowdStrike FDR does not use SNS, so set the default to match
+   config :from_sns, :validate => :boolean, :default => false
+   config :sqs_skip_delete, :validate => :boolean, :default => false
+   config :sqs_wait_time_seconds, :validate => :number, :required => false
+   config :sqs_delete_on_failure, :validate => :boolean, :default => true
+
+   config :visibility_timeout, :validate => :number, :default => 120
+   config :max_processing_time, :validate => :number, :default => 8000
+   ### system
+   config :temporary_directory, :validate => :string, :default => File.join(Dir.tmpdir, "logstash")
+   # To run in multiple threads, use this
+   config :consumer_threads, :validate => :number, :default => 1
+
+
+   public
+
+   # --- BEGIN plugin interface ----------------------------------------#
+
+   # initialisation
+   def register
+     # prepare system
+     FileUtils.mkdir_p(@temporary_directory) unless Dir.exist?(@temporary_directory)
+     @id ||= "Unknown" # use input { id => name } for the thread identifier
+     @credentials_by_bucket = hash_key_is_regex({})
+     @region_by_bucket = hash_key_is_regex({})
+     # create the bucket=>folder=>codec lookup from config options
+     @codec_by_folder = hash_key_is_regex({})
+     @type_by_folder = hash_key_is_regex({})
+
+     # use deprecated settings only if the new config is missing:
+     if @s3_options_by_bucket.nil?
+       # We don't know any bucket name, so we must rely on a "catch-all" regex
+       s3_options = {
+         'bucket_name' => '.*',
+         'folders' => @set_codec_by_folder.map { |key, codec|
+           { 'key' => key, 'codec' => codec }
+         }
+       }
+       if @s3_role_arn.nil?
+         # access key/secret key pair needed
+         unless @s3_access_key_id.nil? or @s3_secret_access_key.nil?
+           s3_options['credentials'] = {
+             'access_key_id' => @s3_access_key_id,
+             'secret_access_key' => @s3_secret_access_key
+           }
+         end
+       else
+         s3_options['credentials'] = {
+           'role' => @s3_role_arn
+         }
+       end
+       @s3_options_by_bucket = [s3_options]
+     end
+
+     @s3_options_by_bucket.each do |options|
+       bucket = options['bucket_name']
+       if options.key?('credentials')
+         @credentials_by_bucket[bucket] = options['credentials']
+       end
+       if options.key?('region')
+         @region_by_bucket[bucket] = options['region']
+       end
+       if options.key?('folders')
+         # make these hashes do key lookups using regex matching
+         folders = hash_key_is_regex({})
+         types = hash_key_is_regex({})
+         options['folders'].each do |entry|
+           @logger.debug("options for folder", :folder => entry)
+           folders[entry['key']] = entry['codec'] if entry.key?('codec')
+           types[entry['key']] = entry['type'] if entry.key?('type')
+         end
+         @codec_by_folder[bucket] = folders unless folders.empty?
+         @type_by_folder[bucket] = types unless types.empty?
+       end
+     end
+
+     @received_stop = Concurrent::AtomicBoolean.new(false)
+
+     # instantiate helpers
+     @sqs_poller = SqsPoller.new(@logger, @received_stop,
+       {
+         visibility_timeout: @visibility_timeout,
+         skip_delete: @sqs_skip_delete,
+         wait_time_seconds: @sqs_wait_time_seconds
+       },
+       {
+         sqs_queue: @queue,
+         queue_owner_aws_account_id: @queue_owner_aws_account_id,
+         from_sns: @from_sns,
+         max_processing_time: @max_processing_time,
+         sqs_delete_on_failure: @sqs_delete_on_failure
+       },
+       aws_options_hash)
+     @s3_client_factory = S3ClientFactory.new(@logger, {
+       aws_region: @region,
+       s3_default_options: @s3_default_options,
+       s3_credentials_by_bucket: @credentials_by_bucket,
+       s3_region_by_bucket: @region_by_bucket,
+       s3_role_session_name: @s3_role_session_name
+     }, aws_options_hash)
+     @s3_downloader = S3Downloader.new(@logger, @received_stop, {
+       s3_client_factory: @s3_client_factory,
+       delete_on_success: @delete_on_success,
+       include_object_properties: @include_object_properties
+     })
+     @codec_factory = CodecFactory.new(@logger, {
+       default_codec: @codec,
+       codec_by_folder: @codec_by_folder
+     })
+     #@log_processor = LogProcessor.new(self)
+
+     # administrative stuff
+     @worker_threads = []
+   end
+
+   # startup
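+   # Design note: run() spawns one control thread per configured consumer thread;
+   # each control thread (re)starts its worker whenever the worker dies, so a
+   # crashed poller thread does not silently stop consumption.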
+   def run(logstash_event_queue)
+     @control_threads = @consumer_threads.times.map do |thread_id|
+       Thread.new do
+         restart_count = 0
+         while not stop?
+           # start threads asynchronously to prevent polling the same message from SQS
+           sleep 0.5
+           worker_thread = run_worker_thread(logstash_event_queue, thread_id)
+           worker_thread.join
+           restart_count += 1
+           thread_id = "#{thread_id}_#{restart_count}"
+           @logger.info("[control_thread] restarting a thread #{thread_id}...", :thread => worker_thread.inspect)
+         end
+       end
+     end
+     @control_threads.each { |t| t.join }
+   end
+
+   # shutdown
+   def stop
+     @received_stop.make_true
+
+     unless @worker_threads.nil?
+       @worker_threads.each do |worker|
+         begin
+           @logger.info("Stopping thread ...", :thread => worker.inspect)
+           worker.wakeup
+         rescue
+           @logger.error("Cannot stop thread ... trying to kill it", :thread => worker.inspect)
+           worker.kill
+         end
+       end
+     end
+   end
+
+   def stop?
+     @received_stop.value
+   end
+
+   # --- END plugin interface ------------------------------------------#
+
+   private
+
+   def run_worker_thread(queue, thread_id)
+     Thread.new do
+       LogStash::Util.set_thread_name("Worker #{@id}/#{thread_id}")
+       @logger.info("[#{Thread.current[:name]}] started (#{Time.now})") #PROFILING
+       temporary_directory = Dir.mktmpdir("#{@temporary_directory}/")
+       @sqs_poller.run do |record|
+         throw :skip_delete if stop?
+         # record is a valid object with the keys :bucket, :key and :size
+         record[:local_file] = File.join(temporary_directory, File.basename(record[:key]))
+         if @s3_downloader.copy_s3object_to_disk(record)
+           completed = catch(:skip_delete) do
+             process(record, queue)
+           end
+           @s3_downloader.cleanup_local_object(record)
+           # re-throw if necessary:
+           throw :skip_delete unless completed
+           @s3_downloader.cleanup_s3object(record)
+         end
+       end
+     end
+   end
+
+   # Will be removed in future releases:
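+   # For example (illustrative key): with s3_key_prefix = "logs", a key of
+   # "logs/elb/2024/01/file.gz" yields the type folder "elb".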
+   def get_object_folder(key)
+     if match = /#{@s3_key_prefix}\/?(?<type_folder>.*?)\/.*/.match(key)
+       return match['type_folder']
+     else
+       return ""
+     end
+   end
+
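+   # Wraps a hash so that lookups treat the stored keys as regexes: the value of
+   # the first key whose regex matches the lookup string is returned, e.g. (illustrative):
+   #   h = hash_key_is_regex({ '.*-elb-logs' => 'plain' })
+   #   h['my-elb-logs']  #=> 'plain'
+   #   h['other-bucket'] #=> nil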
+   def hash_key_is_regex(myhash)
+     myhash.default_proc = lambda do |hash, lookup|
+       result = nil
+       hash.each_pair do |key, value|
+         if %r[#{key}] =~ lookup
+           result = value
+           break
+         end
+       end
+       result
+     end
+     # return the input hash (convenience)
+     return myhash
+   end
+ end # class