logstash-output-s3 4.3.5 → 4.4.0
- checksums.yaml +4 -4
- data/CHANGELOG.md +16 -6
- data/README.md +1 -1
- data/VERSION +1 -0
- data/docs/index.asciidoc +13 -3
- data/lib/logstash/outputs/s3/file_repository.rb +11 -11
- data/lib/logstash/outputs/s3/size_rotation_policy.rb +1 -1
- data/lib/logstash/outputs/s3/temporary_file.rb +48 -5
- data/lib/logstash/outputs/s3/temporary_file_factory.rb +1 -4
- data/lib/logstash/outputs/s3/uploader.rb +2 -0
- data/lib/logstash/outputs/s3.rb +58 -21
- data/lib/logstash-output-s3_jars.rb +4 -0
- data/lib/tasks/build.rake +15 -0
- data/logstash-output-s3.gemspec +2 -2
- data/spec/integration/restore_from_crash_spec.rb +69 -4
- data/spec/outputs/s3/file_repository_spec.rb +7 -2
- data/spec/outputs/s3/size_rotation_policy_spec.rb +2 -2
- data/spec/supports/helpers.rb +3 -1
- data/vendor/jar-dependencies/org/logstash/plugins/outputs/s3/logstash-output-s3/4.4.0/logstash-output-s3-4.4.0.jar +0 -0
- metadata +7 -2
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz: …
-  data.tar.gz: …
+  metadata.gz: 7fe328033b222f10871103a51430bfe6f6a269f15460f70e72e423b6b135927e
+  data.tar.gz: e845c48187f640a948624f7da5bf3b6cec3eee3b25c81f0024014494a71363e5
 SHA512:
-  metadata.gz: …
-  data.tar.gz: …
+  metadata.gz: 9c5a89d3551c5d199b0b289c17b0a36138c4efbb4c4191610167039aab4692588a0c9fc17a158af788ca704d9fbd454a9f2de501b2aa34d17f5aee5820345829
+  data.tar.gz: 989eef2c121767e315199177f0d672bea2b2ff18fcb6a5a9acf21e8b813adadbcd35ab0ce221eff4a633bdaf735e9267b7943a3fbefb788dee42bdc4d1df293a

data/CHANGELOG.md
CHANGED
@@ -1,20 +1,30 @@
+## 4.4.0
+- Logstash recovers corrupted gzip and uploads to S3 [#249](https://github.com/logstash-plugins/logstash-output-s3/pull/249)
+
+## 4.3.7
+- Refactor: avoid usage of CHM (JRuby 9.3.4 work-around) [#248](https://github.com/logstash-plugins/logstash-output-s3/pull/248)
+
+## 4.3.6
+- Docs: more documentation on restore + temp dir [#236](https://github.com/logstash-plugins/logstash-output-s3/pull/236)
+* minor logging improvements - use the same path: naming convention
+
 ## 4.3.5
- - Feat: cast true/false values for additional_settings [#241](https://github.com/logstash-plugins/logstash-output-s3/pull/241)
+- Feat: cast true/false values for additional_settings [#241](https://github.com/logstash-plugins/logstash-output-s3/pull/241)
 
 ## 4.3.4
- - [DOC] Added note about performance implications of interpolated strings in prefixes [#233](https://github.com/logstash-plugins/logstash-output-s3/pull/233)
+- [DOC] Added note about performance implications of interpolated strings in prefixes [#233](https://github.com/logstash-plugins/logstash-output-s3/pull/233)
 
 ## 4.3.3
- - [DOC] Updated links to use shared attributes [#230](https://github.com/logstash-plugins/logstash-output-s3/pull/230)
+- [DOC] Updated links to use shared attributes [#230](https://github.com/logstash-plugins/logstash-output-s3/pull/230)
 
 ## 4.3.2
- - [DOC] Added note that only AWS S3 is supported. No other S3 compatible storage solutions are supported. [#223](https://github.com/logstash-plugins/logstash-output-s3/pull/223)
+- [DOC] Added note that only AWS S3 is supported. No other S3 compatible storage solutions are supported. [#223](https://github.com/logstash-plugins/logstash-output-s3/pull/223)
 
 ## 4.3.1
- - [DOC] Updated setting descriptions for clarity [#219](https://github.com/logstash-plugins/logstash-output-s3/pull/219) and [#220](https://github.com/logstash-plugins/logstash-output-s3/pull/220)
+- [DOC] Updated setting descriptions for clarity [#219](https://github.com/logstash-plugins/logstash-output-s3/pull/219) and [#220](https://github.com/logstash-plugins/logstash-output-s3/pull/220)
 
 ## 4.3.0
- - Feat: Added retry_count and retry_delay config [#218](https://github.com/logstash-plugins/logstash-output-s3/pull/218)
+- Feat: Added retry_count and retry_delay config [#218](https://github.com/logstash-plugins/logstash-output-s3/pull/218)
 
 ## 4.2.0
 - Added ability to specify [ONEZONE_IA](https://aws.amazon.com/s3/storage-classes/#__) as storage_class

data/README.md
CHANGED
@@ -19,7 +19,7 @@ Need help? Try #logstash on freenode IRC or the https://discuss.elastic.co/c/log
 
 ## Developing
 
-### 1. Plugin …
+### 1. Plugin Development and Testing
 
 #### Code
 - To get started, you'll need JRuby with the Bundler gem installed.

data/VERSION
ADDED
@@ -0,0 +1 @@
+4.4.0

data/docs/index.asciidoc
CHANGED
@@ -30,8 +30,9 @@ Other S3 compatible storage solutions are not supported.
 S3 outputs create temporary files into the OS' temporary directory.
 You can specify where to save them using the `temporary_directory` option.
 
-IMPORTANT: For configurations containing multiple s3 outputs with the restore
-option enabled, each output should define its own …
+IMPORTANT: For configurations containing multiple s3 outputs with the `restore`
+option enabled, each output should define its own `temporary_directory`.
+Shared or nested directories can cause data loss upon recovery.
 
 ===== Requirements
 
@@ -255,6 +256,10 @@ The AWS Region
 Used to enable recovery after crash/abnormal termination.
 Temporary log files will be recovered and uploaded.
 
+NOTE: If you're using multiple S3 outputs, always set
+<<plugins-{type}s-{plugin}-temporary_directory>> to a
+unique directory. Otherwise the recovery mechanism won't work correctly.
+
 [id="plugins-{type}s-{plugin}-retry_count"]
 ===== `retry_count`
 
@@ -388,7 +393,12 @@ Defaults to STANDARD.
 * Default value is `"/tmp/logstash"`
 
 Set the directory where logstash will store the tmp files before sending it to S3
-default to the current OS temporary directory in linux
+default to the current OS temporary directory in linux `/tmp/logstash`.
+
+WARNING: Using multiple S3 outputs with `restore => true` requires unique directories
+per output. All of the directory's contents are processed and deleted upon recovery, and shared or nested directories can cause data loss.
+For example, an output using `/tmp/s3` and a second configured with `/tmp/s3/sub` would
+cause issues. Having temporary directories `/tmp/s3/sub1` and `/tmp/s3/sub2` is fine.
 
 [id="plugins-{type}s-{plugin}-time_file"]
 ===== `time_file`

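
To make the unique-directory requirement concrete, here is a minimal two-output pipeline sketch (the bucket names and directory paths are invented for illustration):

    output {
      s3 {
        bucket => "example-bucket-a"            # hypothetical bucket
        region => "us-east-1"
        temporary_directory => "/tmp/s3/sub1"   # unique per output
        restore => true
      }
      s3 {
        bucket => "example-bucket-b"            # hypothetical bucket
        region => "us-east-1"
        temporary_directory => "/tmp/s3/sub2"   # sibling directory, not nested
        restore => true
      }
    }
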
data/lib/logstash/outputs/s3/file_repository.rb
CHANGED
@@ -1,11 +1,9 @@
 # encoding: utf-8
 require "java"
-require "concurrent"
+require "concurrent/map"
 require "concurrent/timer_task"
 require "logstash/util"
 
-ConcurrentHashMap = java.util.concurrent.ConcurrentHashMap
-
 module LogStash
   module Outputs
     class S3
@@ -41,7 +39,7 @@ module LogStash
       end
 
       class FactoryInitializer
-        include java.util.function.Function
+
         def initialize(tags, encoding, temporary_directory, stale_time)
           @tags = tags
           @encoding = encoding
@@ -49,9 +47,10 @@ module LogStash
           @stale_time = stale_time
         end
 
-        def apply(prefix_key)
+        def create_value(prefix_key)
           PrefixedValue.new(TemporaryFileFactory.new(prefix_key, @tags, @encoding, @temporary_directory), @stale_time)
         end
+
       end
 
       def initialize(tags, encoding, temporary_directory,
@@ -59,7 +58,7 @@ module LogStash
                      sweeper_interval = DEFAULT_STATE_SWEEPER_INTERVAL_SECS)
         # The path need to contains the prefix so when we start
         # logtash after a crash we keep the remote structure
-        @prefixed_factories = ConcurrentHashMap.new
+        @prefixed_factories = Concurrent::Map.new
 
         @sweeper_interval = sweeper_interval
 
@@ -69,18 +68,19 @@ module LogStash
       end
 
       def keys
-        @prefixed_factories.keySet
+        @prefixed_factories.keys
       end
 
       def each_files
-        @prefixed_factories.elements.each do |prefixed_file|
+        @prefixed_factories.values.each do |prefixed_file|
           prefixed_file.with_lock { |factory| yield factory.current }
         end
       end
 
       # Return the file factory
       def get_factory(prefix_key)
-        @prefixed_factories.computeIfAbsent(prefix_key, @factory_initializer).with_lock { |factory| yield factory }
+        prefix_val = @prefixed_factories.fetch_or_store(prefix_key) { @factory_initializer.create_value(prefix_key) }
+        prefix_val.with_lock { |factory| yield factory }
       end
 
       def get_file(prefix_key)
@@ -97,7 +97,7 @@ module LogStash
 
       def remove_stale(k, v)
         if v.stale?
-          @prefixed_factories.remove(k, v)
+          @prefixed_factories.delete_pair(k, v)
           v.delete!
         end
       end
@@ -106,7 +106,7 @@ module LogStash
         @stale_sweeper = Concurrent::TimerTask.new(:execution_interval => @sweeper_interval) do
           LogStash::Util.set_thread_name("S3, Stale factory sweeper")
 
-          @prefixed_factories.forEach { |k, v| remove_stale(k,v) }
+          @prefixed_factories.each { |k, v| remove_stale(k,v) }
         end
 
         @stale_sweeper.execute

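
The refactor above replaces the java.util.concurrent.ConcurrentHashMap usage with concurrent-ruby's Concurrent::Map. A minimal standalone sketch of the two primitives the repository now relies on (illustrative, not plugin code):

    require "concurrent/map" # concurrent-ruby, already a Logstash dependency

    map = Concurrent::Map.new

    # fetch_or_store returns the existing value or stores the block's result;
    # even if threads race, a single stored value wins, so each prefix ends
    # up associated with exactly one factory.
    factory = map.fetch_or_store("prefix-a") { Object.new }
    same    = map.fetch_or_store("prefix-a") { Object.new }
    puts factory.equal?(same) # => true

    # delete_pair removes the entry only while it still maps to this exact
    # value, which lets the stale sweeper skip a concurrently replaced entry.
    map.delete_pair("prefix-a", factory)
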
data/lib/logstash/outputs/s3/temporary_file.rb
CHANGED
@@ -2,15 +2,23 @@
 require "thread"
 require "forwardable"
 require "fileutils"
+require "logstash-output-s3_jars"
 
 module LogStash
   module Outputs
     class S3
-      # Wrap the actual file descriptor into an utility class
-      # Make it more OOP and easier to reason with the paths.
+
+      java_import 'org.logstash.plugins.outputs.s3.GzipUtil'
+
+      # Wrap the actual file descriptor into an utility class
+      # Make it more OOP and easier to reason with the paths.
       class TemporaryFile
         extend Forwardable
 
+        GZIP_EXTENSION = "txt.gz"
+        TXT_EXTENSION = "txt"
+        RECOVERED_FILE_NAME_TAG = "-recovered"
+
         def_delegators :@fd, :path, :write, :close, :fsync
 
         attr_reader :fd
@@ -33,8 +41,10 @@ module LogStash
         def size
           # Use the fd size to get the accurate result,
           # so we dont have to deal with fsync
-          # if the file is close we …
+          # if the file is close, fd.size raises an IO exception so we use the File::size
           begin
+            # fd is nil when LS tries to recover gzip file but fails
+            return 0 unless @fd != nil
             @fd.size
           rescue IOError
             ::File.size(path)
@@ -45,7 +55,7 @@ module LogStash
           @key.gsub(/^\//, "")
         end
 
-        # Each temporary file is …
+        # Each temporary file is created inside a directory named with an UUID,
         # instead of deleting the file directly and having the risk of deleting other files
         # we delete the root of the UUID, using a UUID also remove the risk of deleting unwanted file, it acts as
         # a sandbox.
@@ -58,13 +68,46 @@ module LogStash
           size == 0
         end
 
+        # only to cover the case where LS cannot restore corrupted file, file is not exist
+        def recoverable?
+          !@fd.nil?
+        end
+
         def self.create_from_existing_file(file_path, temporary_folder)
           key_parts = Pathname.new(file_path).relative_path_from(temporary_folder).to_s.split(::File::SEPARATOR)
 
+          # recover gzip file and compress back before uploading to S3
+          if file_path.end_with?("." + GZIP_EXTENSION)
+            file_path = self.recover(file_path)
+          end
           TemporaryFile.new(key_parts.slice(1, key_parts.size).join("/"),
-                            ::File.open(file_path, "r"),
+                            ::File.exist?(file_path) ? ::File.open(file_path, "r") : nil, # for the nil case, file size will be 0 and upload will be ignored.
                             ::File.join(temporary_folder, key_parts.slice(0, 1)))
         end
+
+        def self.gzip_extension
+          GZIP_EXTENSION
+        end
+
+        def self.text_extension
+          TXT_EXTENSION
+        end
+
+        def self.recovery_file_name_tag
+          RECOVERED_FILE_NAME_TAG
+        end
+
+        private
+        def self.recover(file_path)
+          full_gzip_extension = "." + GZIP_EXTENSION
+          recovered_txt_file_path = file_path.gsub(full_gzip_extension, RECOVERED_FILE_NAME_TAG + "." + TXT_EXTENSION)
+          recovered_gzip_file_path = file_path.gsub(full_gzip_extension, RECOVERED_FILE_NAME_TAG + full_gzip_extension)
+          GzipUtil.recover(file_path, recovered_txt_file_path)
+          if ::File.exist?(recovered_txt_file_path) && !::File.zero?(recovered_txt_file_path)
+            GzipUtil.compress(recovered_txt_file_path, recovered_gzip_file_path)
+          end
+          recovered_gzip_file_path
+        end
       end
     end
   end
 end

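
The recovery naming scheme in self.recover is easy to trace with plain string operations; a runnable sketch (the file name below is invented for illustration):

    GZIP_EXTENSION = "txt.gz"
    TXT_EXTENSION = "txt"
    RECOVERED_FILE_NAME_TAG = "-recovered"

    corrupted = "/tmp/logstash/uuid/ls.s3.part0.txt.gz" # hypothetical crash leftover
    ext = "." + GZIP_EXTENSION

    txt_path  = corrupted.gsub(ext, RECOVERED_FILE_NAME_TAG + "." + TXT_EXTENSION)
    gzip_path = corrupted.gsub(ext, RECOVERED_FILE_NAME_TAG + ext)

    puts txt_path  # => /tmp/logstash/uuid/ls.s3.part0-recovered.txt
    puts gzip_path # => /tmp/logstash/uuid/ls.s3.part0-recovered.txt.gz

GzipUtil.recover (the vendored Java helper) decompresses whatever is salvageable into the -recovered.txt path, and only if that file is non-empty is it re-compressed into the -recovered.txt.gz path that gets uploaded.
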
data/lib/logstash/outputs/s3/temporary_file_factory.rb
CHANGED
@@ -19,9 +19,6 @@ module LogStash
     # I do not have to mess around to check if the other directory have file in it before destroying them.
     class TemporaryFileFactory
       FILE_MODE = "a"
-      GZIP_ENCODING = "gzip"
-      GZIP_EXTENSION = "txt.gz"
-      TXT_EXTENSION = "txt"
       STRFTIME = "%Y-%m-%dT%H.%M"
 
       attr_accessor :counter, :tags, :prefix, :encoding, :temporary_directory, :current
@@ -48,7 +45,7 @@ module LogStash
 
       private
       def extension
-        gzip? ? GZIP_EXTENSION : TXT_EXTENSION
+        gzip? ? TemporaryFile.gzip_extension : TemporaryFile.text_extension
       end
 
       def gzip?

data/lib/logstash/outputs/s3/uploader.rb
CHANGED
@@ -31,6 +31,7 @@ module LogStash
         end
       end
 
+      # uploads a TemporaryFile to S3
       def upload(file, options = {})
         upload_options = options.fetch(:upload_options, {})
 
@@ -68,6 +69,7 @@ module LogStash
         @workers_pool.shutdown
         @workers_pool.wait_for_termination(nil) # block until its done
       end
+
     end
   end
 end

data/lib/logstash/outputs/s3.rb
CHANGED
@@ -97,6 +97,7 @@ class LogStash::Outputs::S3 < LogStash::Outputs::Base
                      :fallback_policy => :caller_runs
                    })
 
+  GZIP_ENCODING = "gzip"
 
   config_name "s3"
   default :codec, "line"
@@ -110,7 +111,8 @@ class LogStash::Outputs::S3 < LogStash::Outputs::Base
 
   # Set the size of file in bytes, this means that files on bucket when have dimension > file_size, they are stored in two or more file.
   # If you have tags then it will generate a specific size file for every tags
-  ##NOTE: define size of file is the better thing, because generate a local temporary file on disk and then put it in bucket.
+  #
+  # NOTE: define size of file is the better thing, because generate a local temporary file on disk and then put it in bucket.
   config :size_file, :validate => :number, :default => 1024 * 1024 * 5
 
   # Set the time, in MINUTES, to close the current sub_time_section of bucket.
@@ -118,10 +120,10 @@ class LogStash::Outputs::S3 < LogStash::Outputs::Base
   # If it's valued 0 and rotation_strategy is 'time' or 'size_and_time' then the plugin reaise a configuration error.
   config :time_file, :validate => :number, :default => 15
 
-  ## IMPORTANT: if you use multiple instance of s3, you should specify on one of them the "restore=> true" and on the others "restore => false".
-  ## This is hack for not destroy the new files after restoring the initial files.
-  ## If you do not specify "restore => true" when logstash crashes or is restarted, the files are not sent into the bucket,
-  ## for example if you have single Instance.
+  # If `restore => false` is specified and Logstash crashes, the unprocessed files are not sent into the bucket.
+  #
+  # NOTE: that the `recovery => true` default assumes multiple S3 outputs would set a unique `temporary_directory => ...`
+  # if they do not than only a single S3 output is safe to recover (since let-over files are processed and deleted).
   config :restore, :validate => :boolean, :default => true
 
   # The S3 canned ACL to use when putting the file. Defaults to "private".
@@ -147,6 +149,9 @@ class LogStash::Outputs::S3 < LogStash::Outputs::Base
 
   # Set the directory where logstash will store the tmp files before sending it to S3
   # default to the current OS temporary directory in linux /tmp/logstash
+  #
+  # NOTE: the reason we do not have a unique (isolated) temporary directory as a default, to support multiple plugin instances,
+  # is that we would have to rely on something static that does not change between restarts (e.g. a user set id => ...).
   config :temporary_directory, :validate => :string, :default => File.join(Dir.tmpdir, "logstash")
 
   # Specify a prefix to the uploaded filename, this can simulate directories on S3. Prefix does not require leading slash.
@@ -177,7 +182,7 @@ class LogStash::Outputs::S3 < LogStash::Outputs::Base
   config :tags, :validate => :array, :default => []
 
   # Specify the content encoding. Supports ("gzip"). Defaults to "none"
-  config :encoding, :validate => ["none", "gzip"], :default => "none"
+  config :encoding, :validate => ["none", GZIP_ENCODING], :default => "none"
 
   # Define the strategy to use to decide when we need to rotate the file and push it to S3,
   # The default strategy is to check for both size and time, the first one to match will rotate the file.
@@ -311,7 +316,7 @@ class LogStash::Outputs::S3 < LogStash::Outputs::Base
       :server_side_encryption => @server_side_encryption ? @server_side_encryption_algorithm : nil,
       :ssekms_key_id => @server_side_encryption_algorithm == "aws:kms" ? @ssekms_key_id : nil,
       :storage_class => @storage_class,
-      :content_encoding => @encoding == "gzip" ? "gzip" : nil,
+      :content_encoding => @encoding == GZIP_ENCODING ? GZIP_ENCODING : nil,
       :multipart_threshold => @upload_multipart_threshold
     }
   end
@@ -347,10 +352,10 @@ class LogStash::Outputs::S3 < LogStash::Outputs::Base
     temp_file = factory.current
 
     if @rotation.rotate?(temp_file)
-      @logger.debug("Rotate file",
-                    :key => temp_file.key,
-                    :path => temp_file.path,
-                    :strategy => @rotation.class.name)
+      @logger.debug? && @logger.debug("Rotate file",
+                                      :key => temp_file.key,
+                                      :path => temp_file.path,
+                                      :strategy => @rotation.class.name)
 
       upload_file(temp_file)
       factory.rotate!
@@ -360,7 +365,7 @@ class LogStash::Outputs::S3 < LogStash::Outputs::Base
   end
 
   def upload_file(temp_file)
-    @logger.debug("Queue for upload", :path => temp_file.path)
+    @logger.debug? && @logger.debug("Queue for upload", :path => temp_file.path)
 
     # if the queue is full the calling thread will be used to upload
     temp_file.close # make sure the content is on disk
@@ -383,7 +388,7 @@ class LogStash::Outputs::S3 < LogStash::Outputs::Base
   end
 
   def clean_temporary_file(file)
-    @logger.debug("Removing temporary file", :file => file.path)
+    @logger.debug? && @logger.debug("Removing temporary file", :path => file.path)
     file.delete!
   end
 
@@ -393,16 +398,48 @@ class LogStash::Outputs::S3 < LogStash::Outputs::Base
     @crash_uploader = Uploader.new(bucket_resource, @logger, CRASH_RECOVERY_THREADPOOL)
 
     temp_folder_path = Pathname.new(@temporary_directory)
-    Dir.glob(::File.join(@temporary_directory, "**/*"))
-      .select { |file| ::File.file?(file) }
-      .each do |file|
-      temp_file = TemporaryFile.create_from_existing_file(file, temp_folder_path)
-      if temp_file.size > 0
-        @logger.debug("Recovering from crash and uploading", :file => temp_file.path)
-        @crash_uploader.upload_async(temp_file, :on_complete => method(:clean_temporary_file), :upload_options => upload_options)
+    files = Dir.glob(::File.join(@temporary_directory, "**/*"))
+      .select { |file_path| ::File.file?(file_path) }
+    under_recovery_files = get_under_recovery_files(files)
+
+    files.each do |file_path|
+      # when encoding is GZIP, if file is already recovering or recovered and uploading to S3, log and skip
+      if under_recovery_files.include?(file_path)
+        unless file_path.include?(TemporaryFile.gzip_extension)
+          @logger.warn("The #{file_path} file either under recover process or failed to recover before.")
+        end
       else
-        clean_temporary_file(temp_file)
+        temp_file = TemporaryFile.create_from_existing_file(file_path, temp_folder_path)
+        # do not remove or upload if Logstash tries to recover file but fails
+        if temp_file.recoverable?
+          if temp_file.size > 0
+            @logger.debug? && @logger.debug("Recovering from crash and uploading", :path => temp_file.path)
+            @crash_uploader.upload_async(temp_file,
+                                         :on_complete => method(:clean_temporary_file),
+                                         :upload_options => upload_options)
+          else
+            clean_temporary_file(temp_file)
+          end
+        end
+      end
+    end
+  end
+
+  # figures out the recovering files and
+  # creates a skip list to ignore for the rest of processes
+  def get_under_recovery_files(files)
+    skip_files = Set.new
+    return skip_files unless @encoding == GZIP_ENCODING
+
+    files.each do |file_path|
+      if file_path.include?(TemporaryFile.recovery_file_name_tag)
+        skip_files << file_path
+        if file_path.include?(TemporaryFile.gzip_extension)
+          # also include the original corrupted gzip file
+          skip_files << file_path.gsub(TemporaryFile.recovery_file_name_tag, "")
+        end
       end
     end
+    skip_files
   end
 end

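
The effect of get_under_recovery_files is easiest to see on a concrete directory listing; here is a self-contained sketch reimplementing its logic (the paths are invented for illustration):

    require "set"

    TAG  = "-recovered" # stand-in for TemporaryFile.recovery_file_name_tag
    GZIP = "txt.gz"     # stand-in for TemporaryFile.gzip_extension

    # Hypothetical temp-dir contents found after a crash with gzip encoding:
    files = [
      "/tmp/logstash/uuid/part0.txt.gz",           # corrupted original
      "/tmp/logstash/uuid/part0-recovered.txt",    # intermediate plain-text recovery
      "/tmp/logstash/uuid/part0-recovered.txt.gz", # re-compressed, possibly mid-upload
    ]

    skip = Set.new
    files.each do |path|
      next unless path.include?(TAG)
      skip << path
      # stripping the tag from the .txt.gz artifact also skips the original
      skip << path.gsub(TAG, "") if path.include?(GZIP)
    end

    skip.each { |p| puts p } # all three paths are skipped by restore_from_crash
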
data/lib/tasks/build.rake
ADDED
@@ -0,0 +1,15 @@
+# encoding: utf-8
+require "jars/installer"
+require "fileutils"
+
+task :vendor do
+  exit(1) unless system './gradlew vendor'
+  version = File.read("VERSION").strip
+end
+
+desc "clean"
+task :clean do
+  ["build", "vendor/jar-dependencies", "Gemfile.lock"].each do |p|
+    FileUtils.rm_rf(p)
+  end
+end

data/logstash-output-s3.gemspec
CHANGED
@@ -1,13 +1,13 @@
 Gem::Specification.new do |s|
   s.name = 'logstash-output-s3'
-  s.version = '4.3.5'
+  s.version = '4.4.0'
   s.licenses = ['Apache-2.0']
   s.summary = "Sends Logstash events to the Amazon Simple Storage Service"
   s.description = "This gem is a Logstash plugin required to be installed on top of the Logstash core pipeline using $LS_HOME/bin/logstash-plugin install gemname. This gem is not a stand-alone program"
   s.authors = ["Elastic"]
   s.email = 'info@elastic.co'
   s.homepage = "http://www.elastic.co/guide/en/logstash/current/index.html"
-  s.require_paths = ["lib"]
+  s.require_paths = ["lib", "vendor/jar-dependencies"]
 
   # Files
   s.files = Dir["lib/**/*","spec/**/*","*.gemspec","*.md","CONTRIBUTORS","Gemfile","LICENSE","NOTICE.TXT", "vendor/jar-dependencies/**/*.jar", "vendor/jar-dependencies/**/*.rb", "VERSION", "docs/**/*"]

data/spec/integration/restore_from_crash_spec.rb
CHANGED
@@ -7,18 +7,17 @@ require "stud/temporary"
 describe "Restore from crash", :integration => true do
   include_context "setup plugin"
 
-  let(:options) { main_options.merge({ "restore" => true, "canned_acl" => "public-read-write" }) }
-
   let(:number_of_files) { 5 }
   let(:dummy_content) { "foobar\n" * 100 }
-  let(:factory) { LogStash::Outputs::S3::TemporaryFileFactory.new(prefix, tags, "none", temporary_directory)}
 
   before do
     clean_remote_files(prefix)
   end
 
-
   context 'with a non-empty tempfile' do
+    let(:options) { main_options.merge({ "restore" => true, "canned_acl" => "public-read-write" }) }
+    let(:factory) { LogStash::Outputs::S3::TemporaryFileFactory.new(prefix, tags, "none", temporary_directory)}
+
     before do
       # Creating a factory always create a file
       factory.current.write(dummy_content)
@@ -41,6 +40,9 @@ describe "Restore from crash", :integration => true do
   end
 
   context 'with an empty tempfile' do
+    let(:options) { main_options.merge({ "restore" => true, "canned_acl" => "public-read-write" }) }
+    let(:factory) { LogStash::Outputs::S3::TemporaryFileFactory.new(prefix, tags, "none", temporary_directory)}
+
     before do
       factory.current
       factory.rotate!
@@ -63,5 +65,68 @@ describe "Restore from crash", :integration => true do
       expect(bucket_resource.objects(:prefix => prefix).count).to eq(0)
     end
   end
+
+  context "#gzip encoding" do
+    let(:options) { main_options.merge({ "restore" => true, "canned_acl" => "public-read-write", "encoding" => "gzip" }) }
+    let(:factory) { LogStash::Outputs::S3::TemporaryFileFactory.new(prefix, tags, "gzip", temporary_directory)}
+    describe "with empty recovered file" do
+      before do
+        # Creating a factory always create a file
+        factory.current.write('')
+        factory.current.fsync
+        factory.current.close
+      end
+
+      it 'should not upload and not remove temp file' do
+        subject.register
+        try(20) do
+          expect(bucket_resource.objects(:prefix => prefix).count).to eq(0)
+          expect(Dir.glob(File.join(temporary_directory, "*")).size).to eq(1)
+        end
+      end
+    end
+
+    describe "with healthy recovered, size is greater than zero file" do
+      before do
+        # Creating a factory always create a file
+        factory.current.write(dummy_content)
+        factory.current.fsync
+        factory.current.close
+
+        (number_of_files - 1).times do
+          factory.rotate!
+          factory.current.write(dummy_content)
+          factory.current.fsync
+          factory.current.close
+        end
+      end
+
+      it 'should recover, upload to S3 and remove temp file' do
+        subject.register
+        try(20) do
+          expect(bucket_resource.objects(:prefix => prefix).count).to eq(number_of_files)
+          expect(Dir.glob(File.join(temporary_directory, "*")).size).to eq(0)
+          expect(bucket_resource.objects(:prefix => prefix).first.acl.grants.collect(&:permission)).to include("READ", "WRITE")
+        end
+      end
+    end
+
+    describe "with failure when recovering" do
+      before do
+        # Creating a factory always create a file
+        factory.current.write(dummy_content)
+        factory.current.fsync
+      end
+
+      it 'should not upload to S3 and not remove temp file' do
+        subject.register
+        try(20) do
+          expect(bucket_resource.objects(:prefix => prefix).count).to eq(0)
+          expect(Dir.glob(File.join(temporary_directory, "*")).size).to eq(1)
+        end
+      end
+    end
+  end
+
 end
 
data/spec/outputs/s3/file_repository_spec.rb
CHANGED
@@ -88,8 +88,8 @@ describe LogStash::Outputs::S3::FileRepository do
 
   it "returns all available keys" do
     subject.get_file(prefix_key) { |file| file.write("something") }
-    expect(subject.keys…
-    expect(subject.keys.…
+    expect(subject.keys).to include(prefix_key)
+    expect(subject.keys.size).to eq(1)
   end
 
   it "clean stale factories" do
@@ -105,9 +105,14 @@ describe LogStash::Outputs::S3::FileRepository do
 
     @file_repository.get_file("another-prefix") { |file| file.write("hello") }
     expect(@file_repository.size).to eq(2)
+    sleep 1.2 # allow sweeper to kick in
     try(10) { expect(@file_repository.size).to eq(1) }
     expect(File.directory?(path)).to be_falsey
+
+    sleep 1.5 # allow sweeper to kick in, again
+    expect(@file_repository.size).to eq(1)
   end
+
 end

data/spec/outputs/s3/size_rotation_policy_spec.rb
CHANGED
@@ -25,11 +25,11 @@ describe LogStash::Outputs::S3::SizeRotationPolicy do
   end
 
   it "raises an exception if the `size_file` is 0" do
-    expect { described_class.new(0) }.to raise_error(LogStash::ConfigurationError, /need to be …/)
+    expect { described_class.new(0) }.to raise_error(LogStash::ConfigurationError, /need to be greater than 0/)
   end
 
   it "raises an exception if the `size_file` is < 0" do
-    expect { described_class.new(-100) }.to raise_error(LogStash::ConfigurationError, /need to be …/)
+    expect { described_class.new(-100) }.to raise_error(LogStash::ConfigurationError, /need to be greater than 0/)
   end
 
   context "#needs_periodic?" do

data/spec/supports/helpers.rb
CHANGED
@@ -5,6 +5,7 @@ shared_context "setup plugin" do
   let(:bucket) { ENV["AWS_LOGSTASH_TEST_BUCKET"] }
   let(:access_key_id) { ENV["AWS_ACCESS_KEY_ID"] }
   let(:secret_access_key) { ENV["AWS_SECRET_ACCESS_KEY"] }
+  let(:session_token) { ENV["AWS_SESSION_TOKEN"] }
   let(:size_file) { 100 }
   let(:time_file) { 100 }
   let(:tags) { [] }
@@ -18,6 +19,7 @@ shared_context "setup plugin" do
     "temporary_directory" => temporary_directory,
     "access_key_id" => access_key_id,
     "secret_access_key" => secret_access_key,
+    "session_token" => session_token,
     "size_file" => size_file,
     "time_file" => time_file,
     "region" => region,
@@ -25,7 +27,7 @@ shared_context "setup plugin" do
   }
 end
 
-  let(:client_credentials) { Aws::Credentials.new(access_key_id, secret_access_key) }
+  let(:client_credentials) { Aws::Credentials.new(access_key_id, secret_access_key, session_token) }
   let(:bucket_resource) { Aws::S3::Bucket.new(bucket, { :credentials => client_credentials, :region => region }) }
 
   subject { LogStash::Outputs::S3.new(options) }

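
For context, Aws::Credentials accepts an optional session token as its third argument, so the updated helper can authenticate with temporary STS credentials as well as long-lived keys. A brief sketch (bucket name and region are invented for illustration):

    require "aws-sdk-s3"

    creds = Aws::Credentials.new(
      ENV["AWS_ACCESS_KEY_ID"],
      ENV["AWS_SECRET_ACCESS_KEY"],
      ENV["AWS_SESSION_TOKEN"] # nil is fine when using long-lived keys
    )
    bucket = Aws::S3::Bucket.new("example-test-bucket",
                                 :credentials => creds, :region => "us-east-1")
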
data/vendor/jar-dependencies/org/logstash/plugins/outputs/s3/logstash-output-s3/4.4.0/logstash-output-s3-4.4.0.jar
ADDED
Binary file (no diff shown)

metadata
CHANGED
@@ -1,14 +1,14 @@
 --- !ruby/object:Gem::Specification
 name: logstash-output-s3
 version: !ruby/object:Gem::Version
-  version: 4.3.5
+  version: 4.4.0
 platform: ruby
 authors:
 - Elastic
 autorequire:
 bindir: bin
 cert_chain: []
-date: …
+date: 2022-07-19 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   requirement: !ruby/object:Gem::Requirement
@@ -129,7 +129,9 @@ files:
 - LICENSE
 - NOTICE.TXT
 - README.md
+- VERSION
 - docs/index.asciidoc
+- lib/logstash-output-s3_jars.rb
 - lib/logstash/outputs/s3.rb
 - lib/logstash/outputs/s3/file_repository.rb
 - lib/logstash/outputs/s3/patch.rb
@@ -142,6 +144,7 @@ files:
 - lib/logstash/outputs/s3/uploader.rb
 - lib/logstash/outputs/s3/writable_directory_validator.rb
 - lib/logstash/outputs/s3/write_bucket_permission_validator.rb
+- lib/tasks/build.rake
 - logstash-output-s3.gemspec
 - spec/integration/dynamic_prefix_spec.rb
 - spec/integration/gzip_file_spec.rb
@@ -164,6 +167,7 @@ files:
 - spec/outputs/s3_spec.rb
 - spec/spec_helper.rb
 - spec/supports/helpers.rb
+- vendor/jar-dependencies/org/logstash/plugins/outputs/s3/logstash-output-s3/4.4.0/logstash-output-s3-4.4.0.jar
 homepage: http://www.elastic.co/guide/en/logstash/current/index.html
 licenses:
 - Apache-2.0
@@ -174,6 +178,7 @@ post_install_message:
 rdoc_options: []
 require_paths:
 - lib
+- vendor/jar-dependencies
 required_ruby_version: !ruby/object:Gem::Requirement
 requirements:
 - - ">="