logstash-input-s3 3.5.0 → 3.8.1

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: a6d9ab89a4d7925dbaaa02b021b1bbe803426a5c2e5285c1239d72950563fc27
- data.tar.gz: 40aafdb8002e940fcc08f72d119299567dda77210dedcaf436df9a273858ecf1
+ metadata.gz: dd11c4e3d6df5c28143a4a67de1dcaddb7f3d9bd7a0921c278d15482634bc676
+ data.tar.gz: c9687036e353d7a8a71e4d4ec85e497a110ea740b22b6b98173350bcafe8f5b4
  SHA512:
- metadata.gz: 12730fa07325e2549ac32c8b7a629464c3a1d789992c5e07bd3f1bf43ad11b353886882678d00c27a14a7e0e675eaaa4002187f5141d078308de8e3e480d67d3
- data.tar.gz: 24c61bb4d995ef2615cf9ea769ad36d88965eb758e8a9d205394a887ca6f1cac47211e76c42c81e8499f1ab818e072ee50e52431de045e1018402af947f61df3
+ metadata.gz: 3e552d9501b11be011db52415f446a00fc7bfe2ecb601a50f3fa5938de1931fadaded204e46626fd530b6779802627e92f7fe56cda6db5b7f88cd5efadfec2a7
+ data.tar.gz: ed566e40ac8bb1acb29751b222453fa01a315b40a0d3778d23cc024db3d303fd8baf09d287b91f1492d701c007d972032003eb04f2a45c11ab0fe07dff21c77b
data/CHANGELOG.md CHANGED
@@ -1,3 +1,22 @@
+ ## 3.8.1
+ - Feat: cast true/false values for additional_settings [#232](https://github.com/logstash-plugins/logstash-input-s3/pull/232)
+
+ ## 3.8.0
+ - Add ECS v8 support.
+
+ ## 3.7.0
+ - Add ECS support. [#228](https://github.com/logstash-plugins/logstash-input-s3/pull/228)
+ - Fix missing file in cutoff time change. [#224](https://github.com/logstash-plugins/logstash-input-s3/pull/224)
+
+ ## 3.6.0
+ - Fixed unprocessed file with the same `last_modified` in ingestion. [#220](https://github.com/logstash-plugins/logstash-input-s3/pull/220)
+
+ ## 3.5.2
+ - [DOC] Added note that only AWS S3 is supported. No other S3 compatible storage solutions are supported. [#208](https://github.com/logstash-plugins/logstash-input-s3/issues/208)
+
+ ## 3.5.1
+ - [DOC] Added example for `exclude_pattern` and reordered option descriptions [#204](https://github.com/logstash-plugins/logstash-input-s3/issues/204)
+
  ## 3.5.0
  - Added support for including objects restored from Glacier or Glacier Deep [#199](https://github.com/logstash-plugins/logstash-input-s3/issues/199)
  - Added `gzip_pattern` option, enabling more flexible determination of whether a file is gzipped [#165](https://github.com/logstash-plugins/logstash-input-s3/issues/165)
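
For orientation, the headline changes above combine in a pipeline configuration along these lines (a sketch only; the bucket name and values are placeholders):

[source,ruby]
-----
input {
  s3 {
    bucket              => "my-logs"           # placeholder
    ecs_compatibility   => "v8"                # 3.7.0/3.8.0: ECS-compatible metadata fields
    additional_settings => {
      "force_path_style" => "true"             # 3.8.1: "true"/"false" strings are cast to booleans
    }
  }
}
-----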
data/README.md CHANGED
@@ -1,6 +1,6 @@
  # Logstash Plugin
 
- [![Travis Build Status](https://travis-ci.org/logstash-plugins/logstash-input-s3.svg)](https://travis-ci.org/logstash-plugins/logstash-input-s3)
+ [![Travis Build Status](https://travis-ci.com/logstash-plugins/logstash-input-s3.svg)](https://travis-ci.com/logstash-plugins/logstash-input-s3)
 
  This is a plugin for [Logstash](https://github.com/elastic/logstash).
 
data/docs/index.asciidoc CHANGED
@@ -23,11 +23,29 @@ include::{include_path}/plugin_header.asciidoc[]
 
  Stream events from files from a S3 bucket.
 
+ IMPORTANT: The S3 input plugin only supports AWS S3.
+ Other S3 compatible storage solutions are not supported.
+
  Each line from each file generates an event.
  Files ending in `.gz` are handled as gzip'ed files.
 
  Files that are archived to AWS Glacier will be skipped.
 
+ [id="plugins-{type}s-{plugin}-ecs_metadata"]
+ ==== Event Metadata and the Elastic Common Schema (ECS)
+ This plugin adds CloudFront metadata to the event.
+ When ECS compatibility is disabled, the value is stored at the root level.
+ When ECS is enabled, the value is stored under `@metadata`, where it can be used by other plugins in your pipeline.
+
+ Here’s how ECS compatibility mode affects output.
+ [cols="<l,<l,e,<e"]
+ |=======================================================================
+ | ECS disabled | ECS v1 | Availability | Description
+
+ | cloudfront_fields | [@metadata][s3][cloudfront][fields] | available when the file is a CloudFront log | column names of the log
+ | cloudfront_version | [@metadata][s3][cloudfront][version] | available when the file is a CloudFront log | version of the log
+ |=======================================================================
+
  [id="plugins-{type}s-{plugin}-options"]
  ==== S3 Input Configuration Options
 
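Concretely, the table above means a CloudFront log yields the same two values under different names depending on the mode; a sketch (values abbreviated):

[source,ruby]
-----
# ecs_compatibility => disabled: root-level fields
event.get("cloudfront_version")                    # => "1.0"
event.get("cloudfront_fields")                     # => "date time x-edge-location ..."

# ecs_compatibility => v1 or v8: under @metadata, which outputs do not ship by default
event.get("[@metadata][s3][cloudfront][version]")  # => "1.0"
event.get("[@metadata][s3][cloudfront][fields]")   # => "date time x-edge-location ..."
-----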
@@ -44,6 +62,7 @@ This plugin supports the following configuration options plus the <<plugins-{typ
  | <<plugins-{type}s-{plugin}-backup_to_dir>> |<<string,string>>|No
  | <<plugins-{type}s-{plugin}-bucket>> |<<string,string>>|Yes
  | <<plugins-{type}s-{plugin}-delete>> |<<boolean,boolean>>|No
+ | <<plugins-{type}s-{plugin}-ecs_compatibility>> |<<string,string>>|No
  | <<plugins-{type}s-{plugin}-endpoint>> |<<string,string>>|No
  | <<plugins-{type}s-{plugin}-exclude_pattern>> |<<string,string>>|No
  | <<plugins-{type}s-{plugin}-gzip_pattern>> |<<string,string>>|No
@@ -80,6 +99,29 @@ This plugin uses the AWS SDK and supports several ways to get credentials, which
  4. Environment variables `AMAZON_ACCESS_KEY_ID` and `AMAZON_SECRET_ACCESS_KEY`
  5. IAM Instance Profile (available when running inside EC2)
 
+
+ [id="plugins-{type}s-{plugin}-additional_settings"]
+ ===== `additional_settings`
+
+ * Value type is <<hash,hash>>
+ * Default value is `{}`
+
+ Key-value pairs of settings and corresponding values used to parametrize
+ the connection to s3. See full list in https://docs.aws.amazon.com/sdkforruby/api/Aws/S3/Client.html[the AWS SDK documentation]. Example:
+
+ [source,ruby]
+ input {
+   s3 {
+     access_key_id => "1234"
+     secret_access_key => "secret"
+     bucket => "logstash-test"
+     additional_settings => {
+       force_path_style => true
+       follow_redirects => false
+     }
+   }
+ }
+
  [id="plugins-{type}s-{plugin}-aws_credentials_file"]
  ===== `aws_credentials_file`
 
@@ -141,6 +183,18 @@ The name of the S3 bucket.
 
  Whether to delete processed files from the original bucket.
 
+ [id="plugins-{type}s-{plugin}-ecs_compatibility"]
+ ===== `ecs_compatibility`
+
+ * Value type is <<string,string>>
+ * Supported values are:
+ ** `disabled`: does not use ECS-compatible field names
+ ** `v1`, `v8`: uses metadata fields that are compatible with Elastic Common Schema
+
+ Controls this plugin's compatibility with the
+ {ecs-ref}[Elastic Common Schema (ECS)].
+ See <<plugins-{type}s-{plugin}-ecs_metadata>> for detailed information.
+
  [id="plugins-{type}s-{plugin}-endpoint"]
  ===== `endpoint`
 
@@ -157,7 +211,20 @@ guaranteed to work correctly with the AWS SDK.
  * Value type is <<string,string>>
  * Default value is `nil`
 
- Ruby style regexp of keys to exclude from the bucket
+ Ruby style regexp of keys to exclude from the bucket.
+
+ Note that files matching the pattern are skipped _after_ they have been listed.
+ Consider using <<plugins-{type}s-{plugin}-prefix>> instead where possible.
+
+ Example:
+
+ [source,ruby]
+ -----
+ "exclude_pattern" => "\/2020\/04\/"
+ -----
+
+ This pattern excludes all logs containing "/2020/04/" in the path.
+
 
  [id="plugins-{type}s-{plugin}-gzip_pattern"]
  ===== `gzip_pattern`
@@ -167,28 +234,6 @@ Ruby style regexp of keys to exclude from the bucket
 
  Regular expression used to determine whether an input file is in gzip format.
 
- [id="plugins-{type}s-{plugin}-additional_settings"]
- ===== `additional_settings`
-
- * Value type is <<hash,hash>>
- * Default value is `{}`
-
- Key-value pairs of settings and corresponding values used to parametrize
- the connection to s3. See full list in https://docs.aws.amazon.com/sdkforruby/api/Aws/S3/Client.html[the AWS SDK documentation]. Example:
-
- [source,ruby]
- input {
-   s3 {
-     "access_key_id" => "1234"
-     "secret_access_key" => "secret"
-     "bucket" => "logstash-test"
-     "additional_settings" => {
-       "force_path_style" => true
-       "follow_redirects" => false
-     }
-   }
- }
-
  [id="plugins-{type}s-{plugin}-include_object_properties"]
  ===== `include_object_properties`
 
lib/logstash/inputs/s3.rb CHANGED
@@ -9,6 +9,7 @@ require "stud/interval"
  require "stud/temporary"
  require "aws-sdk"
  require "logstash/inputs/s3/patch"
+ require "logstash/plugin_mixins/ecs_compatibility_support"
 
  require 'java'
 
@@ -27,6 +28,7 @@ class LogStash::Inputs::S3 < LogStash::Inputs::Base
  java_import java.util.zip.ZipException
 
  include LogStash::PluginMixins::AwsConfig::V2
+ include LogStash::PluginMixins::ECSCompatibilitySupport(:disabled, :v1, :v8 => :v1)
 
  config_name "s3"
 
@@ -86,6 +88,14 @@ class LogStash::Inputs::S3 < LogStash::Inputs::Base
  # default to an expression that matches *.gz and *.gzip file extensions
  config :gzip_pattern, :validate => :string, :default => "\.gz(ip)?$"
 
+ CUTOFF_SECOND = 3
+
+ def initialize(*params)
+   super
+   @cloudfront_fields_key = ecs_select[disabled: 'cloudfront_fields', v1: '[@metadata][s3][cloudfront][fields]']
+   @cloudfront_version_key = ecs_select[disabled: 'cloudfront_version', v1: '[@metadata][s3][cloudfront][version]']
+ end
+
  def register
    require "fileutils"
    require "digest/md5"
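A note on the initializer above: `ecs_select` comes from the logstash-mixin-ecs_compatibility_support mixin (added as a runtime dependency further down), and resolves a value once per configured mode; the `:v8 => :v1` argument in the include aliases v8 to the v1 names. Roughly:

[source,ruby]
-----
# ecs_compatibility => disabled
@cloudfront_fields_key   # => "cloudfront_fields"

# ecs_compatibility => v1 (or v8, which is aliased to v1)
@cloudfront_fields_key   # => "[@metadata][s3][cloudfront][fields]"
-----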
@@ -126,8 +136,9 @@ class LogStash::Inputs::S3 < LogStash::Inputs::Base
  end # def run
 
  def list_new_files
-   objects = {}
+   objects = []
    found = false
+   current_time = Time.now
    begin
      @s3bucket.objects(:prefix => @prefix).each do |log|
        found = true
@@ -138,10 +149,12 @@ class LogStash::Inputs::S3 < LogStash::Inputs::Base
        @logger.debug('Object Zero Length', :key => log.key)
      elsif !sincedb.newer?(log.last_modified)
        @logger.debug('Object Not Modified', :key => log.key)
+     elsif log.last_modified > (current_time - CUTOFF_SECOND).utc # files modified within the cutoff window are processed in the next cycle
+       @logger.debug('Object Modified After Cutoff Time', :key => log.key)
      elsif (log.storage_class == 'GLACIER' || log.storage_class == 'DEEP_ARCHIVE') && !file_restored?(log.object)
        @logger.debug('Object Archived to Glacier', :key => log.key)
      else
-       objects[log.key] = log.last_modified
+       objects << log
        @logger.debug("Added to objects[]", :key => log.key, :length => objects.length)
      end
    end
@@ -149,7 +162,7 @@ class LogStash::Inputs::S3 < LogStash::Inputs::Base
    rescue Aws::Errors::ServiceError => e
      @logger.error("Unable to list objects in bucket", :exception => e.class, :message => e.message, :backtrace => e.backtrace, :prefix => prefix)
    end
-   objects.keys.sort {|a,b| objects[a] <=> objects[b]}
+   objects.sort_by { |log| log.last_modified }
  end # def fetch_new_files
 
  def backup_to_bucket(object)
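The new `CUTOFF_SECOND` branch above is the heart of the 3.6.0/3.7.0 data-loss fix: objects whose `last_modified` falls within the last 3 seconds are deferred to the next polling cycle, so a file that is still settling, or that shares its timestamp with the sincedb watermark, is neither half-read nor skipped forever. A minimal sketch of the same idea, assuming `objects` is an array of S3 object summaries:

[source,ruby]
-----
CUTOFF_SECOND = 3

# keep objects old enough to be considered settled;
# anything newer is picked up on the next poll
cutoff = Time.now - CUTOFF_SECOND
ready, deferred = objects.partition { |log| log.last_modified <= cutoff }
-----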
@@ -171,11 +184,11 @@ class LogStash::Inputs::S3 < LogStash::Inputs::Base
  def process_files(queue)
    objects = list_new_files
 
-   objects.each do |key|
+   objects.each do |log|
      if stop?
        break
      else
-       process_log(queue, key)
+       process_log(queue, log)
      end
    end
  end # def process_files
@@ -223,9 +236,6 @@ class LogStash::Inputs::S3 < LogStash::Inputs::Base
    else
      decorate(event)
 
-     event.set("cloudfront_version", metadata[:cloudfront_version]) unless metadata[:cloudfront_version].nil?
-     event.set("cloudfront_fields", metadata[:cloudfront_fields]) unless metadata[:cloudfront_fields].nil?
-
      if @include_object_properties
        event.set("[@metadata][s3]", object.data.to_h)
      else
@@ -233,6 +243,8 @@ class LogStash::Inputs::S3 < LogStash::Inputs::Base
      end
 
      event.set("[@metadata][s3][key]", object.key)
+     event.set(@cloudfront_version_key, metadata[:cloudfront_version]) unless metadata[:cloudfront_version].nil?
+     event.set(@cloudfront_fields_key, metadata[:cloudfront_fields]) unless metadata[:cloudfront_fields].nil?
 
      queue << event
    end
@@ -341,14 +353,22 @@ class LogStash::Inputs::S3 < LogStash::Inputs::Base
  end
 
  def symbolized_settings
-   @symbolized_settings ||= symbolize(@additional_settings)
+   @symbolized_settings ||= symbolize_keys_and_cast_true_false(@additional_settings)
  end
 
- def symbolize(hash)
-   return hash unless hash.is_a?(Hash)
-   symbolized = {}
-   hash.each { |key, value| symbolized[key.to_sym] = symbolize(value) }
-   symbolized
+ def symbolize_keys_and_cast_true_false(hash)
+   case hash
+   when Hash
+     symbolized = {}
+     hash.each { |key, value| symbolized[key.to_sym] = symbolize_keys_and_cast_true_false(value) }
+     symbolized
+   when 'true'
+     true
+   when 'false'
+     false
+   else
+     hash
+   end
  end
 
  def ignore_filename?(filename)
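The renamed helper is what makes quoted booleans in `additional_settings` usable: values arrive from the Logstash config as strings, while the AWS SDK expects real booleans. A usage sketch, with the expected result taken from the spec below:

[source,ruby]
-----
symbolize_keys_and_cast_true_false(
  "force_path_style" => "true",
  "ssl_verify_peer"  => "false",
  "profile"          => "logstash"
)
# => { :force_path_style => true, :ssl_verify_peer => false, :profile => "logstash" }
-----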
@@ -367,19 +387,22 @@ class LogStash::Inputs::S3 < LogStash::Inputs::Base
    end
  end
 
- def process_log(queue, key)
-   @logger.debug("Processing", :bucket => @bucket, :key => key)
-   object = @s3bucket.object(key)
+ def process_log(queue, log)
+   @logger.debug("Processing", :bucket => @bucket, :key => log.key)
+   object = @s3bucket.object(log.key)
 
-   filename = File.join(temporary_directory, File.basename(key))
+   filename = File.join(temporary_directory, File.basename(log.key))
    if download_remote_file(object, filename)
      if process_local_log(queue, filename, object)
-       lastmod = object.last_modified
-       backup_to_bucket(object)
-       backup_to_dir(filename)
-       delete_file_from_bucket(object)
-       FileUtils.remove_entry_secure(filename, true)
-       sincedb.write(lastmod)
+       if object.last_modified == log.last_modified
+         backup_to_bucket(object)
+         backup_to_dir(filename)
+         delete_file_from_bucket(object)
+         FileUtils.remove_entry_secure(filename, true)
+         sincedb.write(log.last_modified)
+       else
+         @logger.info("#{log.key} is updated at #{object.last_modified} and will process in the next cycle")
+       end
      end
    else
      FileUtils.remove_entry_secure(filename, true)
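The `last_modified` equality guard closes a race between listing and processing: if the object was overwritten after the bucket listing captured its summary, the sincedb watermark is left untouched and the new version is picked up on the next cycle. Condensed (names as in the diff):

[source,ruby]
-----
object = @s3bucket.object(log.key)   # fresh lookup at processing time

if object.last_modified == log.last_modified
  sincedb.write(log.last_modified)   # safe: we processed exactly what we listed
else
  # overwritten since listing; do not advance sincedb, retry next cycle
end
-----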
logstash-input-s3.gemspec CHANGED
@@ -1,7 +1,7 @@
  Gem::Specification.new do |s|
 
    s.name = 'logstash-input-s3'
-   s.version = '3.5.0'
+   s.version = '3.8.1'
    s.licenses = ['Apache-2.0']
    s.summary = "Streams events from files in a S3 bucket"
    s.description = "This gem is a Logstash plugin required to be installed on top of the Logstash core pipeline using $LS_HOME/bin/logstash-plugin install gemname. This gem is not a stand-alone program"
@@ -27,4 +27,5 @@ Gem::Specification.new do |s|
    s.add_development_dependency 'logstash-devutils'
    s.add_development_dependency "logstash-codec-json"
    s.add_development_dependency "logstash-codec-multiline"
+   s.add_runtime_dependency 'logstash-mixin-ecs_compatibility_support', '~>1.2'
  end
spec/inputs/s3_spec.rb CHANGED
@@ -9,6 +9,7 @@ require_relative "../support/helpers"
  require "stud/temporary"
  require "aws-sdk"
  require "fileutils"
+ require 'logstash/plugin_mixins/ecs_compatibility_support/spec_helper'
 
  describe LogStash::Inputs::S3 do
    let(:temporary_directory) { Stud::Temporary.pathname }
@@ -24,6 +25,7 @@ describe LogStash::Inputs::S3 do
      "sincedb_path" => File.join(sincedb_path, ".sincedb")
    }
  }
+ let(:cutoff) { LogStash::Inputs::S3::CUTOFF_SECOND }
 
 
  before do
@@ -33,13 +35,16 @@ describe LogStash::Inputs::S3 do
  end
 
  context "when interrupting the plugin" do
-   let(:config) { super.merge({ "interval" => 5 }) }
+   let(:config) { super().merge({ "interval" => 5 }) }
+   let(:s3_obj) { double(:key => "awesome-key", :last_modified => Time.now.round, :content_length => 10, :storage_class => 'STANDARD', :object => double(:data => double(:restore => nil)) ) }
 
    before do
-     expect_any_instance_of(LogStash::Inputs::S3).to receive(:list_new_files).and_return(TestInfiniteS3Object.new)
+     expect_any_instance_of(LogStash::Inputs::S3).to receive(:list_new_files).and_return(TestInfiniteS3Object.new(s3_obj))
    end
 
-   it_behaves_like "an interruptible input plugin"
+   it_behaves_like "an interruptible input plugin" do
+     let(:allowed_lag) { 16 } if LOGSTASH_VERSION.split('.').first.to_i <= 6
+   end
  end
 
  describe "#register" do
@@ -79,10 +84,10 @@ describe LogStash::Inputs::S3 do
  end
 
  describe "additional_settings" do
-   context 'when force_path_style is set' do
+   context "supported settings" do
      let(:settings) {
        {
-         "additional_settings" => { "force_path_style" => true },
+         "additional_settings" => { "force_path_style" => 'true', "ssl_verify_peer" => 'false', "profile" => 'logstash' },
          "bucket" => "logstash-test",
        }
      }
@@ -90,7 +95,7 @@ describe LogStash::Inputs::S3 do
      it 'should instantiate AWS::S3 clients with force_path_style set' do
        expect(Aws::S3::Resource).to receive(:new).with({
          :region => subject.region,
-         :force_path_style => true
+         :force_path_style => true, :ssl_verify_peer => false, :profile => 'logstash'
        }).and_call_original
 
        subject.send(:get_s3object)
@@ -115,11 +120,12 @@ describe LogStash::Inputs::S3 do
  describe "#list_new_files" do
    before { allow_any_instance_of(Aws::S3::Bucket).to receive(:objects) { objects_list } }
 
-   let!(:present_object) {double(:key => 'this-should-be-present', :last_modified => Time.now, :content_length => 10, :storage_class => 'STANDARD', :object => double(:data => double(:restore => nil)) ) }
-   let!(:archived_object) {double(:key => 'this-should-be-archived', :last_modified => Time.now, :content_length => 10, :storage_class => 'GLACIER', :object => double(:data => double(:restore => nil)) ) }
-   let!(:deep_archived_object) {double(:key => 'this-should-be-archived', :last_modified => Time.now, :content_length => 10, :storage_class => 'GLACIER', :object => double(:data => double(:restore => nil)) ) }
-   let!(:restored_object) {double(:key => 'this-should-be-restored-from-archive', :last_modified => Time.now, :content_length => 10, :storage_class => 'GLACIER', :object => double(:data => double(:restore => 'ongoing-request="false", expiry-date="Thu, 01 Jan 2099 00:00:00 GMT"')) ) }
-   let!(:deep_restored_object) {double(:key => 'this-should-be-restored-from-deep-archive', :last_modified => Time.now, :content_length => 10, :storage_class => 'DEEP_ARCHIVE', :object => double(:data => double(:restore => 'ongoing-request="false", expiry-date="Thu, 01 Jan 2099 00:00:00 GMT"')) ) }
+   let!(:present_object_after_cutoff) {double(:key => 'this-should-not-be-present', :last_modified => Time.now, :content_length => 10, :storage_class => 'STANDARD', :object => double(:data => double(:restore => nil)) ) }
+   let!(:present_object) {double(:key => 'this-should-be-present', :last_modified => Time.now - cutoff, :content_length => 10, :storage_class => 'STANDARD', :object => double(:data => double(:restore => nil)) ) }
+   let!(:archived_object) {double(:key => 'this-should-be-archived', :last_modified => Time.now - cutoff, :content_length => 10, :storage_class => 'GLACIER', :object => double(:data => double(:restore => nil)) ) }
+   let!(:deep_archived_object) {double(:key => 'this-should-be-archived', :last_modified => Time.now - cutoff, :content_length => 10, :storage_class => 'GLACIER', :object => double(:data => double(:restore => nil)) ) }
+   let!(:restored_object) {double(:key => 'this-should-be-restored-from-archive', :last_modified => Time.now - cutoff, :content_length => 10, :storage_class => 'GLACIER', :object => double(:data => double(:restore => 'ongoing-request="false", expiry-date="Thu, 01 Jan 2099 00:00:00 GMT"')) ) }
+   let!(:deep_restored_object) {double(:key => 'this-should-be-restored-from-deep-archive', :last_modified => Time.now - cutoff, :content_length => 10, :storage_class => 'DEEP_ARCHIVE', :object => double(:data => double(:restore => 'ongoing-request="false", expiry-date="Thu, 01 Jan 2099 00:00:00 GMT"')) ) }
    let(:objects_list) {
      [
        double(:key => 'exclude-this-file-1', :last_modified => Time.now - 2 * day, :content_length => 100, :storage_class => 'STANDARD'),
@@ -127,7 +133,8 @@ describe LogStash::Inputs::S3 do
        archived_object,
        restored_object,
        deep_restored_object,
-       present_object
+       present_object,
+       present_object_after_cutoff
      ]
    }
 
@@ -135,7 +142,7 @@ describe LogStash::Inputs::S3 do
      plugin = LogStash::Inputs::S3.new(config.merge({ "exclude_pattern" => "^exclude" }))
      plugin.register
 
-     files = plugin.list_new_files
+     files = plugin.list_new_files.map { |item| item.key }
      expect(files).to include(present_object.key)
      expect(files).to include(restored_object.key)
      expect(files).to include(deep_restored_object.key)
@@ -143,6 +150,7 @@ describe LogStash::Inputs::S3 do
      expect(files).to_not include('exclude/logstash') # matches exclude pattern
      expect(files).to_not include(archived_object.key) # archived
      expect(files).to_not include(deep_archived_object.key) # archived
+     expect(files).to_not include(present_object_after_cutoff.key) # after cutoff
      expect(files.size).to eq(3)
    end
 
@@ -150,7 +158,7 @@ describe LogStash::Inputs::S3 do
      plugin = LogStash::Inputs::S3.new(config)
      plugin.register
 
-     files = plugin.list_new_files
+     files = plugin.list_new_files.map { |item| item.key }
      expect(files).to include(present_object.key)
      expect(files).to include(restored_object.key)
      expect(files).to include(deep_restored_object.key)
@@ -158,6 +166,7 @@ describe LogStash::Inputs::S3 do
      expect(files).to include('exclude/logstash') # no exclude pattern given
      expect(files).to_not include(archived_object.key) # archived
      expect(files).to_not include(deep_archived_object.key) # archived
+     expect(files).to_not include(present_object_after_cutoff.key) # after cutoff
      expect(files.size).to eq(5)
    end
 
@@ -204,7 +213,7 @@ describe LogStash::Inputs::S3 do
        'backup_to_bucket' => config['bucket']}))
      plugin.register
 
-     files = plugin.list_new_files
+     files = plugin.list_new_files.map { |item| item.key }
      expect(files).to include(present_object.key)
      expect(files).to_not include('mybackup-log-1') # matches backup prefix
      expect(files.size).to eq(1)
@@ -218,7 +227,7 @@ describe LogStash::Inputs::S3 do
      allow_any_instance_of(LogStash::Inputs::S3::SinceDB::File).to receive(:read).and_return(Time.now - day)
      plugin.register
 
-     files = plugin.list_new_files
+     files = plugin.list_new_files.map { |item| item.key }
      expect(files).to include(present_object.key)
      expect(files).to include(restored_object.key)
      expect(files).to include(deep_restored_object.key)
@@ -226,6 +235,7 @@ describe LogStash::Inputs::S3 do
      expect(files).to_not include('exclude/logstash') # too old
      expect(files).to_not include(archived_object.key) # archived
      expect(files).to_not include(deep_archived_object.key) # archived
+     expect(files).to_not include(present_object_after_cutoff.key) # after cutoff
      expect(files.size).to eq(3)
    end
 
@@ -241,13 +251,14 @@ describe LogStash::Inputs::S3 do
 
      plugin = LogStash::Inputs::S3.new(config.merge({ 'prefix' => prefix }))
      plugin.register
-     expect(plugin.list_new_files).to eq([present_object.key])
+     expect(plugin.list_new_files.map { |item| item.key }).to eq([present_object.key])
    end
 
    it 'should sort return object sorted by last_modification date with older first' do
      objects = [
        double(:key => 'YESTERDAY', :last_modified => Time.now - day, :content_length => 5, :storage_class => 'STANDARD'),
        double(:key => 'TODAY', :last_modified => Time.now, :content_length => 5, :storage_class => 'STANDARD'),
+       double(:key => 'TODAY_BEFORE_CUTOFF', :last_modified => Time.now - cutoff, :content_length => 5, :storage_class => 'STANDARD'),
        double(:key => 'TWO_DAYS_AGO', :last_modified => Time.now - 2 * day, :content_length => 5, :storage_class => 'STANDARD')
      ]
 
@@ -256,7 +267,7 @@ describe LogStash::Inputs::S3 do
 
      plugin = LogStash::Inputs::S3.new(config)
      plugin.register
-     expect(plugin.list_new_files).to eq(['TWO_DAYS_AGO', 'YESTERDAY', 'TODAY'])
+     expect(plugin.list_new_files.map { |item| item.key }).to eq(['TWO_DAYS_AGO', 'YESTERDAY', 'TODAY_BEFORE_CUTOFF'])
    end
 
    describe "when doing backup on the s3" do
@@ -451,7 +462,7 @@ describe LogStash::Inputs::S3 do
    end
 
    context 'compressed with gzip extension and using custom gzip_pattern option' do
-     let(:config) { super.merge({ "gzip_pattern" => "gee.zip$" }) }
+     let(:config) { super().merge({ "gzip_pattern" => "gee.zip$" }) }
      let(:log) { double(:key => 'log.gee.zip', :last_modified => Time.now - 2 * day, :content_length => 5, :storage_class => 'STANDARD') }
      let(:log_file) { File.join(File.dirname(__FILE__), '..', 'fixtures', 'compressed.log.gee.zip') }
      include_examples "generated events"
@@ -486,12 +497,20 @@ describe LogStash::Inputs::S3 do
    context 'cloudfront' do
      let(:log_file) { File.join(File.dirname(__FILE__), '..', 'fixtures', 'cloudfront.log') }
 
-     it 'should extract metadata from cloudfront log' do
-       events = fetch_events(config)
+     describe "metadata", :ecs_compatibility_support, :aggregate_failures do
+       ecs_compatibility_matrix(:disabled, :v1) do |ecs_select|
+         before(:each) do
+           allow_any_instance_of(described_class).to receive(:ecs_compatibility).and_return(ecs_compatibility)
+         end
 
-       events.each do |event|
-         expect(event.get('cloudfront_fields')).to eq('date time x-edge-location c-ip x-event sc-bytes x-cf-status x-cf-client-id cs-uri-stem cs-uri-query c-referrer x-page-url​ c-user-agent x-sname x-sname-query x-file-ext x-sid')
-         expect(event.get('cloudfront_version')).to eq('1.0')
+         it 'should extract metadata from cloudfront log' do
+           events = fetch_events(config)
+
+           events.each do |event|
+             expect(event.get ecs_select[disabled: "cloudfront_fields", v1: "[@metadata][s3][cloudfront][fields]"] ).to eq('date time x-edge-location c-ip x-event sc-bytes x-cf-status x-cf-client-id cs-uri-stem cs-uri-query c-referrer x-page-url​ c-user-agent x-sname x-sname-query x-file-ext x-sid')
+             expect(event.get ecs_select[disabled: "cloudfront_version", v1: "[@metadata][s3][cloudfront][version]"] ).to eq('1.0')
+           end
+         end
        end
      end
 
@@ -499,7 +518,7 @@ describe LogStash::Inputs::S3 do
    end
 
    context 'when include_object_properties is set to true' do
-     let(:config) { super.merge({ "include_object_properties" => true }) }
+     let(:config) { super().merge({ "include_object_properties" => true }) }
      let(:log_file) { File.join(File.dirname(__FILE__), '..', 'fixtures', 'uncompressed.log') }
 
      it 'should extract object properties onto [@metadata][s3]' do
@@ -513,7 +532,7 @@ describe LogStash::Inputs::S3 do
    end
 
    context 'when include_object_properties is set to false' do
-     let(:config) { super.merge({ "include_object_properties" => false }) }
+     let(:config) { super().merge({ "include_object_properties" => false }) }
      let(:log_file) { File.join(File.dirname(__FILE__), '..', 'fixtures', 'uncompressed.log') }
 
      it 'should NOT extract object properties onto [@metadata][s3]' do
@@ -525,6 +544,67 @@ describe LogStash::Inputs::S3 do
 
      include_examples "generated events"
    end
+ end
+
+ describe "data loss" do
+   let(:s3_plugin) { LogStash::Inputs::S3.new(config) }
+   let(:queue) { [] }
+
+   before do
+     s3_plugin.register
+   end
+
+   context 'events come after cutoff time' do
+     it 'should be processed in next cycle' do
+       s3_objects = [
+         double(:key => 'TWO_DAYS_AGO', :last_modified => Time.now.round - 2 * day, :content_length => 5, :storage_class => 'STANDARD'),
+         double(:key => 'YESTERDAY', :last_modified => Time.now.round - day, :content_length => 5, :storage_class => 'STANDARD'),
+         double(:key => 'TODAY_BEFORE_CUTOFF', :last_modified => Time.now.round - cutoff, :content_length => 5, :storage_class => 'STANDARD'),
+         double(:key => 'TODAY', :last_modified => Time.now.round, :content_length => 5, :storage_class => 'STANDARD'),
+         double(:key => 'TODAY', :last_modified => Time.now.round, :content_length => 5, :storage_class => 'STANDARD')
+       ]
+       size = s3_objects.length
+
+       allow_any_instance_of(Aws::S3::Bucket).to receive(:objects) { s3_objects }
+       allow_any_instance_of(Aws::S3::Bucket).to receive(:object).and_return(*s3_objects)
+       expect(s3_plugin).to receive(:process_log).at_least(size).and_call_original
+       expect(s3_plugin).to receive(:stop?).and_return(false).at_least(size)
+       expect(s3_plugin).to receive(:download_remote_file).and_return(true).at_least(size)
+       expect(s3_plugin).to receive(:process_local_log).and_return(true).at_least(size)
+
+       # first iteration
+       s3_plugin.process_files(queue)
+
+       # second iteration
+       sleep(cutoff + 1)
+       s3_plugin.process_files(queue)
+     end
+   end
+
+   context 's3 object updated after getting summary' do
+     it 'should not update sincedb' do
+       s3_summary = [
+         double(:key => 'YESTERDAY', :last_modified => Time.now.round - day, :content_length => 5, :storage_class => 'STANDARD'),
+         double(:key => 'TODAY', :last_modified => Time.now.round - (cutoff * 10), :content_length => 5, :storage_class => 'STANDARD')
+       ]
 
+       s3_objects = [
+         double(:key => 'YESTERDAY', :last_modified => Time.now.round - day, :content_length => 5, :storage_class => 'STANDARD'),
+         double(:key => 'TODAY_UPDATED', :last_modified => Time.now.round, :content_length => 5, :storage_class => 'STANDARD')
+       ]
+
+       size = s3_objects.length
+
+       allow_any_instance_of(Aws::S3::Bucket).to receive(:objects) { s3_summary }
+       allow_any_instance_of(Aws::S3::Bucket).to receive(:object).and_return(*s3_objects)
+       expect(s3_plugin).to receive(:process_log).at_least(size).and_call_original
+       expect(s3_plugin).to receive(:stop?).and_return(false).at_least(size)
+       expect(s3_plugin).to receive(:download_remote_file).and_return(true).at_least(size)
+       expect(s3_plugin).to receive(:process_local_log).and_return(true).at_least(size)
+
+       s3_plugin.process_files(queue)
+       expect(s3_plugin.send(:sincedb).read).to eq(s3_summary[0].last_modified)
+     end
+   end
    end
  end
spec/integration/s3_spec.rb CHANGED
@@ -10,6 +10,7 @@ describe LogStash::Inputs::S3, :integration => true, :s3 => true do
 
    upload_file('../fixtures/uncompressed.log' , "#{prefix}uncompressed_1.log")
    upload_file('../fixtures/compressed.log.gz', "#{prefix}compressed_1.log.gz")
+   sleep(LogStash::Inputs::S3::CUTOFF_SECOND + 1)
  end
 
  after do
@@ -28,6 +29,7 @@ describe LogStash::Inputs::S3, :integration => true, :s3 => true do
      "prefix" => prefix,
      "temporary_directory" => temporary_directory } }
  let(:backup_prefix) { "backup/" }
+ let(:backup_bucket) { "logstash-s3-input-backup" }
 
  it "support prefix to scope the remote files" do
    events = fetch_events(minimal_settings)
@@ -49,13 +51,17 @@ describe LogStash::Inputs::S3, :integration => true, :s3 => true do
  end
 
  context "remote backup" do
+   before do
+     create_bucket(backup_bucket)
+   end
+
    it "another bucket" do
-     fetch_events(minimal_settings.merge({ "backup_to_bucket" => "logstash-s3-input-backup"}))
-     expect(list_remote_files("", "logstash-s3-input-backup").size).to eq(2)
+     fetch_events(minimal_settings.merge({ "backup_to_bucket" => backup_bucket}))
+     expect(list_remote_files("", backup_bucket).size).to eq(2)
    end
 
    after do
-     delete_bucket("logstash-s3-input-backup")
+     delete_bucket(backup_bucket)
    end
  end
end
spec/support/helpers.rb CHANGED
@@ -23,6 +23,10 @@ def list_remote_files(prefix, target_bucket = ENV['AWS_LOGSTASH_TEST_BUCKET'])
    bucket.objects(:prefix => prefix).collect(&:key)
  end
 
+ def create_bucket(name)
+   s3object.bucket(name).create
+ end
+
  def delete_bucket(name)
    s3object.bucket(name).objects.map(&:delete)
    s3object.bucket(name).delete
@@ -33,13 +37,16 @@ def s3object
  end
 
  class TestInfiniteS3Object
+   def initialize(s3_obj)
+     @s3_obj = s3_obj
+   end
+
    def each
      counter = 1
 
      loop do
-       yield "awesome-#{counter}"
+       yield @s3_obj
        counter +=1
      end
    end
- end
-
+ end
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: logstash-input-s3
  version: !ruby/object:Gem::Version
-   version: 3.5.0
+   version: 3.8.1
  platform: ruby
  authors:
  - Elastic
  autorequire:
  bindir: bin
  cert_chain: []
- date: 2020-03-19 00:00:00.000000000 Z
+ date: 2021-10-13 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
    requirement: !ruby/object:Gem::Requirement
@@ -100,6 +100,20 @@ dependencies:
      - - ">="
        - !ruby/object:Gem::Version
          version: '0'
+ - !ruby/object:Gem::Dependency
+   requirement: !ruby/object:Gem::Requirement
+     requirements:
+     - - "~>"
+       - !ruby/object:Gem::Version
+         version: '1.2'
+   name: logstash-mixin-ecs_compatibility_support
+   prerelease: false
+   type: :runtime
+   version_requirements: !ruby/object:Gem::Requirement
+     requirements:
+     - - "~>"
+       - !ruby/object:Gem::Version
+         version: '1.2'
  description: This gem is a Logstash plugin required to be installed on top of the
    Logstash core pipeline using $LS_HOME/bin/logstash-plugin install gemname. This
    gem is not a stand-alone program
@@ -153,8 +167,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
    - !ruby/object:Gem::Version
      version: '0'
  requirements: []
- rubyforge_project:
- rubygems_version: 2.6.13
+ rubygems_version: 3.1.6
  signing_key:
  specification_version: 4
  summary: Streams events from files in a S3 bucket