fluent-plugin-cloudfront-log-optimized 0.2.0
- checksums.yaml +7 -0
- data/.gitignore +42 -0
- data/CHANGELOG.md +23 -0
- data/Gemfile +4 -0
- data/README.md +101 -0
- data/Rakefile +12 -0
- data/fluent-plugin-cloudfront-log-optimized.gemspec +26 -0
- data/lib/fluent/plugin/enumerable_inflater.rb +68 -0
- data/lib/fluent/plugin/in_cloudfront_log.rb +216 -0
- data/test/helper.rb +28 -0
- data/test/plugin/test_in_cloudfrontlog.rb +108 -0
- metadata +149 -0
checksums.yaml
ADDED
@@ -0,0 +1,7 @@
|
|
1
|
+
---
|
2
|
+
SHA256:
|
3
|
+
metadata.gz: 41c311f36c1f68872f1ad935ea398c3f21b0eb2154ed2ded25934f2e70966ab6
|
4
|
+
data.tar.gz: 0a6997034d17b4b5c02c66ac3dd8017cc842b40b3025bdcbb18c68719641536c
|
5
|
+
SHA512:
|
6
|
+
metadata.gz: 87aec8f7754aefe0a81b47a7beb255d6b1598213a73623117cad56206e5f7b263ac3118963db64e65b1906f947ab448fe2d3e70448d14a74f5aeb6e155b8fd13
|
7
|
+
data.tar.gz: f77b92a998259fbf4b3c778b4032c87e3860b37355e10631cc01dce3c099f89bf094cab113c7e9f62875d8701b1910c17b289236addd726245eed3b6c7c7c905
|
data/.gitignore
ADDED
@@ -0,0 +1,42 @@
|
|
1
|
+
# Created by https://www.gitignore.io/api/ruby
|
2
|
+
|
3
|
+
### Ruby ###
|
4
|
+
.bin
|
5
|
+
*.gems
|
6
|
+
*.gem
|
7
|
+
*.rbc
|
8
|
+
/.config
|
9
|
+
/coverage/
|
10
|
+
/InstalledFiles
|
11
|
+
/pkg/
|
12
|
+
/spec/reports/
|
13
|
+
/spec/examples.txt
|
14
|
+
/test/tmp/
|
15
|
+
/test/version_tmp/
|
16
|
+
/tmp/
|
17
|
+
|
18
|
+
## Specific to RubyMotion:
|
19
|
+
.dat*
|
20
|
+
.repl_history
|
21
|
+
build/
|
22
|
+
|
23
|
+
## Documentation cache and generated files:
|
24
|
+
/.yardoc/
|
25
|
+
/_yardoc/
|
26
|
+
/doc/
|
27
|
+
/rdoc/
|
28
|
+
|
29
|
+
## Environment normalization:
|
30
|
+
/.bundle/
|
31
|
+
/vendor/bundle
|
32
|
+
/lib/bundler/man/
|
33
|
+
|
34
|
+
# for a library or gem, you might want to ignore these files since the code is
|
35
|
+
# intended to run in multiple environments; otherwise, check them in:
|
36
|
+
Gemfile.lock
|
37
|
+
.ruby-version
|
38
|
+
.ruby-gemset
|
39
|
+
|
40
|
+
# unless supporting rvm < 1.11.0 or doing something fancy, ignore this:
|
41
|
+
.rvmrc
|
42
|
+
|
data/CHANGELOG.md
ADDED
@@ -0,0 +1,23 @@
|
|
1
|
+
## Release 0.2.0 (kjwierenga)
|
2
|
+
Merge branch 'feature/enh/rename-to-optimized' into develop
|
3
|
+
Merge branch 'feature/fix/process-log-files-in-last-modified-order' into develop
|
4
|
+
- [fix] process log files in order of ascending last_modified time
|
5
|
+
Merge branch 'feature/enh/reduce-memory-usage' into develop
|
6
|
+
- [enh] process log files with (small) constant memory usage even for very large log files
|
7
|
+
- [add] EnumerableInflater so we can inflate the .gz file in a streaming fashion (greatly reducing memory usage)
|
8
|
+
- [enh] transpose.to_h is faster than Hash[....zip(line)]
|
9
|
+
Merge branch 'feature/fix/unescape-once' into develop
|
10
|
+
- [enh] URI.unescape is deprecated and CGI.unescape is faster
|
11
|
+
- [fix] unescape once; CloudFront encodes once so we should unescape once
|
12
|
+
Merge branch 'feature/enh/optional-parse-date-time' into develop
|
13
|
+
- [enh] use Time.iso8601 which is much faster than the Time.parse
|
14
|
+
- [add] option to make parsing of date time optional
|
15
|
+
Merge branch 'feature/enh/modernize-tests' into develop
|
16
|
+
Merge branch 'feature/fix/verbose-param' into develop
|
17
|
+
- [fix] verbose flag; it was set to string causing it to be always enabled
|
18
|
+
|
19
|
+
## Release 0.1.1 (packetloop)
|
20
|
+
- Packetloop 2019 version
|
21
|
+
|
22
|
+
## Release 0.0.5 (kubihie)
|
23
|
+
- Original 2016 version
|
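The 0.2.0 changelog entries about `transpose.to_h`, unescaping once, and `Time.iso8601` can be illustrated together. The following is a minimal sketch, not the plugin's code: `FIELDS`, `build_record`, and `record_timestamp` are hypothetical names, and the sample line is made up for illustration.

```ruby
require 'cgi'
require 'time'

# Hypothetical field list, as it might be read from a "#Fields:" header line.
FIELDS = %w[date time cs-uri-stem cs-uri-query].freeze

# Build a record hash from one tab-separated log line.
def build_record(fields, line)
  values = CGI.unescape(line).strip.split("\t") # unescape exactly once
  [fields, values].transpose.to_h               # pair each field with its value
end

# Turn the date and time columns into a Unix timestamp (UTC).
def record_timestamp(record)
  Time.iso8601("#{record['date']}T#{record['time']}+00:00").to_i
end

line = "2021-02-06\t12:34:56\t/index.html\tq=hoge%2520fuga\n"
record = build_record(FIELDS, line)
timestamp = record_timestamp(record)
```

Note that `transpose` raises an `IndexError` when the number of values differs from the number of fields, so a real caller would probably want to guard against malformed lines.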
data/Gemfile
ADDED
data/README.md
ADDED
@@ -0,0 +1,101 @@
|
|
1
|
+
# Fluent::Plugin::Cloudfront::Log
|
2
|
+
This plugin connects to the S3 bucket where you store your CloudFront logs. Once it has processed the logs and shipped them to FluentD, it moves them to another location (either another bucket or a subdirectory).
|
3
|
+
|
4
|
+
## Lineage
|
5
|
+
This is a fork of [packetloop's v0.14 fix](https://github.com/packetloop/fluent-plugin-cloudfront-log-v0.14-fix)
|
6
|
+
which is a fork of the original [kubihie version](https://github.com/kubihie/fluent-plugin-cloudfront-log)
|
7
|
+
with contributions from [lenfree's version](https://github.com/lenfree/fluent-plugin-cloudfront-log).
|
8
|
+
This fork has optimizations to process hundreds of large CloudFront log files (tens of MB)
|
9
|
+
efficiently and with constant memory usage.
|
10
|
+
|
11
|
+
I will publish this gem so it can be used in production assuming upstream
|
12
|
+
repositories are unmaintained. I would be happy to merge these changes back into [kubihie's version](https://github.com/kubihie/fluent-plugin-cloudfront-log).
|
13
|
+
## Example config
|
14
|
+
```
|
15
|
+
<source>
|
16
|
+
@type cloudfront_log
|
17
|
+
log_bucket cloudfront-logs
|
18
|
+
log_prefix production
|
19
|
+
region us-east-1
|
20
|
+
interval 300
|
21
|
+
aws_key_id xxxxxx
|
22
|
+
aws_sec_key xxxxxx
|
23
|
+
tag reverb.cloudfront
|
24
|
+
verbose true
|
25
|
+
</source>
|
26
|
+
```
|
27
|
+
|
28
|
+
## Configuration options
|
29
|
+
|
30
|
+
#### log_bucket
|
31
|
+
This option tells the plugin where to look for the cloudfront logs
|
32
|
+
|
33
|
+
#### log_prefix
|
34
|
+
For example, if your logs are stored in a folder called "production" under the "cloudfront-logs" bucket, they would be stored in S3 as "cloudfront-logs/production/log.gz".
|
35
|
+
In this case, you'd want to use the prefix "production".
|
36
|
+
|
37
|
+
#### moved_log_bucket
|
38
|
+
Here you can specify where you'd like the log files to be moved after processing. If left blank this defaults to a folder called `_moved` under the bucket configured for `@log_bucket`.
|
39
|
+
|
40
|
+
#### moved_log_prefix
|
41
|
+
This specifies the key prefix under which the log files are stored once they're processed. This defaults to `_moved`.
|
42
|
+
|
43
|
+
#### region
|
44
|
+
The region where your cloudfront logs are stored.
|
45
|
+
|
46
|
+
#### interval
|
47
|
+
This is the rate in seconds at which we check the bucket for updated logs. This defaults to 300.
|
48
|
+
#### aws_key_id
|
49
|
+
The ID of your AWS keypair. Note: Since this plugin uses aws-sdk under the hood you can leave these two aws fields blank if you have an IAM role applied to your FluentD instance.
|
50
|
+
|
51
|
+
#### aws_sec_key
|
52
|
+
The secret key portion of your AWS keypair
|
53
|
+
|
54
|
+
#### tag
|
55
|
+
The FluentD tag assigned to the emitted records. Defaults to `cloudfront.access`.
|
56
|
+
|
57
|
+
#### thread_num
|
58
|
+
The number of threads to create to concurrently process the S3 objects. Defaults to 4.
|
59
|
+
|
60
|
+
#### s3_get_max
|
61
|
+
Controls the size of the S3 object list fetched on each iteration. Defaults to 200.
|
62
|
+
|
63
|
+
#### delimiter
|
64
|
+
You shouldn't have to specify delimiter at all but this option is provided and passed to the S3 client in the event that you have a weird delimiter in your log file names. Defaults to `nil`.
|
65
|
+
|
66
|
+
#### verbose
|
67
|
+
Turn this on if you'd like to see verbose information about the plugin and how it's processing your files.
|
68
|
+
|
69
|
+
#### parse_date_time
|
70
|
+
Turn this off when you don't want the date and time to be parsed into the timestamp for the record.
|
71
|
+
Used when timestamp parsing can be implemented faster downstream. Default is true.
|
72
|
+
|
73
|
+
## Installation
|
74
|
+
|
75
|
+
Add this line to your application's Gemfile:
|
76
|
+
|
77
|
+
```ruby
|
78
|
+
gem 'fluent-plugin-cloudfront-log-optimized'
|
79
|
+
```
|
80
|
+
|
81
|
+
And then execute:
|
82
|
+
|
83
|
+
$ bundle
|
84
|
+
|
85
|
+
Or install it yourself as:
|
86
|
+
|
87
|
+
$ gem install 'fluent-plugin-cloudfront-log-optimized'
|
88
|
+
|
89
|
+
## Development
|
90
|
+
|
91
|
+
After checking out the repo, run `bin/setup` to install dependencies. You can also run `bin/console` for an interactive prompt that will allow you to experiment.
|
92
|
+
|
93
|
+
To install this gem onto your local machine, run `bundle exec rake install`. To release a new version, update the version number in `version.rb`, and then run `bundle exec rake release`, which will create a git tag for the version, push git commits and tags, and push the `.gem` file to [rubygems.org](https://rubygems.org).
|
94
|
+
|
95
|
+
## Contributing
|
96
|
+
|
97
|
+
Bug reports and pull requests are welcome on GitHub at https://github.com/packetloop/fluent-plugin-cloudfront-log-optimized.
|
98
|
+
|
99
|
+
## Credits
|
100
|
+
|
101
|
+
kubihie
|
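To make the `log_prefix`, `moved_log_bucket`, and `moved_log_prefix` options in the README more concrete, here is a small sketch of where a processed object would end up. It assumes keys are built as `<prefix>/<filename>`; `moved_location` is a hypothetical helper, not a method of the gem.

```ruby
# Illustrative helper (not part of the gem): compute the destination of a
# processed log object, assuming keys are built as "<prefix>/<filename>".
def moved_location(key, log_bucket:, log_prefix:, moved_log_bucket: nil, moved_log_prefix: '_moved')
  # Strip the source prefix to get the bare file name.
  filename = key.sub(%r{\A#{Regexp.escape(log_prefix)}/}, '')
  {
    bucket: moved_log_bucket || log_bucket, # falls back to the source bucket
    key: [moved_log_prefix, filename].join('/')
  }
end

dest = moved_location('production/log.gz',
                      log_bucket: 'cloudfront-logs',
                      log_prefix: 'production')
```

With the defaults, the sample object from the README would move from `production/log.gz` to `_moved/log.gz` in the same bucket.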
data/Rakefile
ADDED
data/fluent-plugin-cloudfront-log-optimized.gemspec
ADDED
@@ -0,0 +1,26 @@
|
|
1
|
+
# coding: utf-8
|
2
|
+
lib = File.expand_path('../lib', __FILE__)
|
3
|
+
$LOAD_PATH.unshift(lib) unless $LOAD_PATH.include?(lib)
|
4
|
+
|
5
|
+
Gem::Specification.new do |spec|
|
6
|
+
spec.name = "fluent-plugin-cloudfront-log-optimized"
|
7
|
+
spec.version = "0.2.0"
|
8
|
+
spec.authors = ["kubihee", "lenfree", "kjwierenga"]
|
9
|
+
spec.email = ["kubihie@gmail.com", "lenfree.yeung@gmail.com", "k.j.wierenga@gmail.com"]
|
10
|
+
|
11
|
+
spec.summary = %q{AWS CloudFront log input plugin optimized for large log files. Credit to kubihie and lenfree.}
|
12
|
+
spec.description = %q{AWS CloudFront log input plugin for fluentd. Upstream appears to be unmaintained.}
|
13
|
+
spec.homepage = "https://github.com/kjwierenga/fluent-plugin-cloudfront-log-optimized"
|
14
|
+
|
15
|
+
spec.files = `git ls-files`.split($/)
|
16
|
+
spec.executables = spec.files.grep(%r{^bin/}).map{ |f| File.basename(f) }
|
17
|
+
spec.test_files = spec.files.grep(%r{^(test|spec|features)/})
|
18
|
+
spec.require_paths = ["lib"]
|
19
|
+
|
20
|
+
spec.add_dependency "fluentd", ">= 0.14.0", "< 2"
|
21
|
+
spec.add_dependency "aws-sdk-s3", "~> 1"
|
22
|
+
spec.add_dependency "aws-sdk-sqs", "~> 1"
|
23
|
+
spec.add_development_dependency "bundler", "~> 1.7"
|
24
|
+
spec.add_development_dependency "rake", "~> 12"
|
25
|
+
spec.add_development_dependency 'test-unit', "~> 2"
|
26
|
+
end
|
data/lib/fluent/plugin/enumerable_inflater.rb
ADDED
@@ -0,0 +1,68 @@
|
|
1
|
+
# frozen_string_literal: true
|
2
|
+
require 'zlib'
|
3
|
+
|
4
|
+
class EnumerableInflater
|
5
|
+
|
6
|
+
CHUNKSIZE = 1024**2
|
7
|
+
|
8
|
+
def initialize(options = {})
|
9
|
+
@io = options[:io]
|
10
|
+
end
|
11
|
+
|
12
|
+
def lines
|
13
|
+
Enumerator.new { |main_enum| stream_lines(main_enum) }
|
14
|
+
end
|
15
|
+
|
16
|
+
private
|
17
|
+
|
18
|
+
attr_reader :io, :inflater
|
19
|
+
|
20
|
+
def stream_lines(main_enum)
|
21
|
+
init_gzip_inflater
|
22
|
+
|
23
|
+
split_lines.lazy.each { |line| main_enum << line }
|
24
|
+
ensure
|
25
|
+
inflater&.close
|
26
|
+
end
|
27
|
+
|
28
|
+
def split_lines
|
29
|
+
buffer = ""
|
30
|
+
|
31
|
+
Enumerator.new do |yielder|
|
32
|
+
ungzip.each do |decompressed_chunk|
|
33
|
+
buffer += decompressed_chunk
|
34
|
+
new_buffer = ""
|
35
|
+
buffer.each_line do |l|
|
36
|
+
l.end_with?("\n") ? yielder << l : new_buffer += l
|
37
|
+
end
|
38
|
+
|
39
|
+
buffer = new_buffer
|
40
|
+
end
|
41
|
+
end
|
42
|
+
end
|
43
|
+
|
44
|
+
def ungzip
|
45
|
+
Enumerator.new do |yielder|
|
46
|
+
stream_file.each do |compressed|
|
47
|
+
inflater.inflate(compressed) do |decompressed_chunk|
|
48
|
+
yielder << decompressed_chunk
|
49
|
+
end
|
50
|
+
end
|
51
|
+
end
|
52
|
+
end
|
53
|
+
|
54
|
+
def stream_file
|
55
|
+
Enumerator.new do |stream_enum|
|
56
|
+
io.each(nil, CHUNKSIZE) do |chunk|
|
57
|
+
stream_enum << chunk
|
58
|
+
end
|
59
|
+
end
|
60
|
+
end
|
61
|
+
|
62
|
+
def init_gzip_inflater
|
63
|
+
# Taken from examples in:
|
64
|
+
# https://docs.ruby-lang.org/en/2.0.0/Zlib/Inflate.html
|
65
|
+
@inflater = Zlib::Inflate.new(Zlib::MAX_WBITS + 32)
|
66
|
+
end
|
67
|
+
|
68
|
+
end
|
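The streaming approach of `EnumerableInflater` can be condensed into a single method for experimentation. This is a minimal sketch under stated assumptions, not the gem's implementation: `each_gzip_line` and `chunk_size` are illustrative names, and the input is an in-memory `StringIO` instead of a downloaded file. Unlike the class above, the sketch also yields a final line that lacks a trailing newline.

```ruby
require 'zlib'
require 'stringio'

# Yield decompressed lines from a gzip IO, reading `chunk_size` bytes at a time
# so that memory use stays bounded regardless of the file size.
def each_gzip_line(io, chunk_size = 16 * 1024)
  return enum_for(:each_gzip_line, io, chunk_size) unless block_given?

  # Adding 32 to the window bits enables automatic gzip/zlib header detection.
  inflater = Zlib::Inflate.new(Zlib::MAX_WBITS + 32)
  buffer = +''
  begin
    while (chunk = io.read(chunk_size))
      buffer << inflater.inflate(chunk)
      # Emit every complete line; keep the trailing partial line in the buffer.
      while (newline = buffer.index("\n"))
        yield buffer.slice!(0..newline)
      end
    end
    yield buffer unless buffer.empty? # final line without a newline
  ensure
    inflater.close
  end
end

compressed = StringIO.new(Zlib.gzip("#Version: 1.0\nfirst\nsecond\n"))
lines = each_gzip_line(compressed, 8).to_a
```

The deliberately tiny chunk size of 8 bytes in the usage line forces lines to span several chunks, which exercises the buffering logic.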
data/lib/fluent/plugin/in_cloudfront_log.rb
ADDED
@@ -0,0 +1,216 @@
|
|
1
|
+
require 'fluent/input'
|
2
|
+
require 'fluent/plugin/enumerable_inflater'
|
3
|
+
require 'fileutils'
|
4
|
+
|
5
|
+
class Fluent::Cloudfront_LogInput < Fluent::Input
|
6
|
+
Fluent::Plugin.register_input('cloudfront_log', self)
|
7
|
+
|
8
|
+
config_param :aws_key_id, :string, :default => nil, :secret => true
|
9
|
+
config_param :aws_sec_key, :string, :default => nil, :secret => true
|
10
|
+
config_param :log_bucket, :string
|
11
|
+
config_param :log_prefix, :string
|
12
|
+
config_param :moved_log_bucket, :string, :default => nil
|
13
|
+
config_param :moved_log_prefix, :string, :default => '_moved'
|
14
|
+
config_param :region, :string
|
15
|
+
|
16
|
+
config_param :tag, :string, :default => 'cloudfront.access'
|
17
|
+
config_param :interval, :integer, :default => 300
|
18
|
+
config_param :delimiter, :string, :default => nil
|
19
|
+
config_param :verbose, :bool, :default => false
|
20
|
+
config_param :thread_num, :integer, :default => 4
|
21
|
+
config_param :s3_get_max, :integer, :default => 200
|
22
|
+
|
23
|
+
config_param :parse_date_time, :bool, :default => true
|
24
|
+
|
25
|
+
def initialize
|
26
|
+
super
|
27
|
+
require 'logger'
|
28
|
+
require 'zlib'
|
29
|
+
require 'aws-sdk-s3'
|
30
|
+
require 'time'
|
31
|
+
require 'uri'
|
32
|
+
end
|
33
|
+
|
34
|
+
def configure(conf)
|
35
|
+
super
|
36
|
+
|
37
|
+
raise Fluent::ConfigError.new unless @region
|
38
|
+
raise Fluent::ConfigError.new unless @log_bucket
|
39
|
+
raise Fluent::ConfigError.new unless @log_prefix
|
40
|
+
|
41
|
+
@moved_log_bucket = @log_bucket unless @moved_log_bucket
|
42
|
+
@moved_log_prefix = @log_prefix + '_moved' unless @moved_log_prefix
|
43
|
+
|
44
|
+
if @verbose
|
45
|
+
log.info("@log_bucket: #{@log_bucket}")
|
46
|
+
log.info("@moved_log_bucket: #{@moved_log_bucket}")
|
47
|
+
log.info("@log_prefix: #{@log_prefix}")
|
48
|
+
log.info("@moved_log_prefix: #{@moved_log_prefix}")
|
49
|
+
log.info("@thread_num: #{@thread_num}")
|
50
|
+
log.info("@parse_date_time: #{@parse_date_time}")
|
51
|
+
end
|
52
|
+
end
|
53
|
+
|
54
|
+
def start
|
55
|
+
super
|
56
|
+
log.info("Cloudfront verbose logging enabled") if @verbose
|
57
|
+
client
|
58
|
+
|
59
|
+
@tmp_dir = File.join(plugin_root_dir || '/', 'tmp')
|
60
|
+
FileUtils.mkdir_p @tmp_dir
|
61
|
+
|
62
|
+
@loop = Coolio::Loop.new
|
63
|
+
timer = TimerWatcher.new(@interval, true, log, &method(:input))
|
64
|
+
|
65
|
+
@loop.attach(timer)
|
66
|
+
@thread = Thread.new(&method(:run))
|
67
|
+
end
|
68
|
+
|
69
|
+
def shutdown
|
70
|
+
@loop.stop
|
71
|
+
@thread.join
|
72
|
+
end
|
73
|
+
|
74
|
+
def run
|
75
|
+
@loop.run
|
76
|
+
end
|
77
|
+
|
78
|
+
def client
|
79
|
+
begin
|
80
|
+
options = {:region => @region}
|
81
|
+
if @aws_key_id and @aws_sec_key
|
82
|
+
options[:access_key_id] = @aws_key_id
|
83
|
+
options[:secret_access_key] = @aws_sec_key
|
84
|
+
end
|
85
|
+
@client = Aws::S3::Client.new(options)
|
86
|
+
rescue => e
|
87
|
+
log.warn("S3 client error. #{e.message}")
|
88
|
+
end
|
89
|
+
end
|
90
|
+
|
91
|
+
def parse_header(line)
|
92
|
+
case line
|
93
|
+
when /^#Version:.+/i then
|
94
|
+
@version = line.sub(/^#Version:/i, '').strip
|
95
|
+
when /^#Fields:.+/i then
|
96
|
+
@fields = line.sub(/^#Fields:/i, '').strip.split("\s")
|
97
|
+
end
|
98
|
+
end
|
99
|
+
|
100
|
+
def purge(filename)
|
101
|
+
# Key is the name of the object without the bucket prefix, e.g: asdf/asdf.jpg
|
102
|
+
source_object_key = [@log_prefix, filename].join('/')
|
103
|
+
|
104
|
+
# Full path includes bucket name in addition to object key, e.g: bucket/asdf/asdf.jpg
|
105
|
+
source_object_full_path = [@log_bucket, source_object_key].join('/')
|
106
|
+
|
107
|
+
dest_object_key = [@moved_log_prefix, filename].join('/')
|
108
|
+
dest_object_full_path = [@moved_log_bucket, dest_object_key].join('/')
|
109
|
+
|
110
|
+
log.info("Copying object: #{source_object_full_path} to #{dest_object_full_path}") if @verbose
|
111
|
+
|
112
|
+
begin
|
113
|
+
client.copy_object(:bucket => @moved_log_bucket, :copy_source => source_object_full_path, :key => dest_object_key)
|
114
|
+
rescue => e
|
115
|
+
log.warn("S3 Copy client error. #{e.message}")
|
116
|
+
return
|
117
|
+
end
|
118
|
+
|
119
|
+
|
120
|
+
log.info("Deleting object: #{source_object_key} from #{@log_bucket}") if @verbose
|
121
|
+
begin
|
122
|
+
client.delete_object(:bucket => @log_bucket, :key => source_object_key)
|
123
|
+
rescue => e
|
124
|
+
log.warn("S3 Delete client error. #{e.message}")
|
125
|
+
return
|
126
|
+
end
|
127
|
+
end
|
128
|
+
|
129
|
+
def process_line(line)
|
130
|
+
if line[0] == '#'
|
131
|
+
parse_header(line)
|
132
|
+
return
|
133
|
+
end
|
134
|
+
|
135
|
+
record = [
|
136
|
+
@fields,
|
137
|
+
CGI.unescape(line).strip.split("\t") # hoge%2520fuga -> hoge%20fuga
|
138
|
+
].transpose.to_h
|
139
|
+
|
140
|
+
timestamp = if @parse_date_time
|
141
|
+
Time.iso8601("#{record['date']}T#{record['time']}+00:00").to_i
|
142
|
+
else
|
143
|
+
Time.now.to_i
|
144
|
+
end
|
145
|
+
|
146
|
+
router.emit(@tag, timestamp, record)
|
147
|
+
end
|
148
|
+
|
149
|
+
def process_content(content)
|
150
|
+
filename = content.key.sub(/^#{@log_prefix}\//, "")
|
151
|
+
log.info("CloudFront Currently processing: #{filename}") if @verbose
|
152
|
+
return if filename[-1] == '/' #skip directory/
|
153
|
+
return unless filename[-2, 2] == 'gz' #skip without gz file
|
154
|
+
|
155
|
+
tmp_file_name = File.join(@tmp_dir, content.key.split('/').last)
|
156
|
+
File.open(tmp_file_name, File::RDWR|File::CREAT, 0644) do |file|
|
157
|
+
# download file to local file system
|
158
|
+
client.get_object({bucket: @log_bucket, key: content.key}, target: file)
|
159
|
+
|
160
|
+
# inflate and process in chunks
|
161
|
+
file.rewind
|
162
|
+
EnumerableInflater.new(io: file).lines.each do |line|
|
163
|
+
process_line(line)
|
164
|
+
end
|
165
|
+
purge(filename)
|
166
|
+
rescue => e
|
167
|
+
log.warn("S3 GET client error. #{e.message}")
|
168
|
+
return
|
169
|
+
ensure
|
170
|
+
File.delete(file)
|
171
|
+
end
|
172
|
+
end
|
173
|
+
|
174
|
+
def input
|
175
|
+
log.info("CloudFront Beginning input going to list S3")
|
176
|
+
begin
|
177
|
+
s3_list = client.list_objects_v2(:bucket => @log_bucket, :prefix => @log_prefix , :delimiter => @delimiter, :max_keys => @s3_get_max)
|
178
|
+
rescue => e
|
179
|
+
log.warn("S3 GET list error. #{e.message}")
|
180
|
+
return
|
181
|
+
end
|
182
|
+
log.info("Finished S3 get list")
|
183
|
+
queue = Queue.new
|
184
|
+
threads = []
|
185
|
+
log.debug("S3 List size: #{s3_list.contents.length}")
|
186
|
+
s3_list.contents.sort_by(&:last_modified).each do |content|
|
187
|
+
queue << content
|
188
|
+
end
|
189
|
+
# BEGINS THREADS
|
190
|
+
@thread_num.times do
|
191
|
+
threads << Thread.new do
|
192
|
+
until queue.empty?
|
193
|
+
work_unit = queue.pop(true) rescue nil
|
194
|
+
if work_unit
|
195
|
+
process_content(work_unit)
|
196
|
+
end
|
197
|
+
end
|
198
|
+
end
|
199
|
+
end
|
200
|
+
log.debug("CloudFront Waiting for Threads to finish...")
|
201
|
+
threads.each { |t| t.join }
|
202
|
+
log.debug("CloudFront Finished")
|
203
|
+
end
|
204
|
+
|
205
|
+
class TimerWatcher < Coolio::TimerWatcher
|
206
|
+
def initialize(interval, repeat, log, &callback)
|
207
|
+
@callback = callback
|
208
|
+
@log = log
|
209
|
+
super(interval, repeat)
|
210
|
+
end
|
211
|
+
|
212
|
+
def on_timer
|
213
|
+
@callback.call
|
214
|
+
end
|
215
|
+
end
|
216
|
+
end
|
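The worker pattern used in `input` (fill a `Queue`, then let `thread_num` threads drain it with a non-blocking `pop`) can be tried in isolation. The sketch below is an approximation for illustration only: `drain_in_parallel` is a hypothetical helper, and the work items are plain strings standing in for S3 objects.

```ruby
# Drain `items` with `thread_num` worker threads, yielding each item to the
# caller's block. Illustrative helper, not a method of the plugin.
def drain_in_parallel(items, thread_num)
  queue = Queue.new
  items.each { |item| queue << item }

  threads = Array.new(thread_num) do
    Thread.new do
      loop do
        begin
          work_unit = queue.pop(true) # non-blocking; raises ThreadError when empty
        rescue ThreadError
          break # queue drained, let this worker finish
        end
        yield work_unit
      end
    end
  end
  threads.each(&:join)
end

processed = Queue.new # Queue is thread-safe, so workers can append concurrently
drain_in_parallel(%w[a.gz b.gz c.gz d.gz e.gz], 4) { |key| processed << key }
result = Array.new(processed.size) { processed.pop }.sort
```

Items are dequeued in insertion order, but with more than one worker the order in which they finish is not guaranteed, so sorting the queue beforehand only fixes the order in which work starts.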
data/test/helper.rb
ADDED
@@ -0,0 +1,28 @@
|
|
1
|
+
require 'rubygems'
|
2
|
+
require 'bundler'
|
3
|
+
begin
|
4
|
+
Bundler.setup(:default, :development)
|
5
|
+
rescue Bundler::BundlerError => e
|
6
|
+
$stderr.puts e.message
|
7
|
+
$stderr.puts "Run `bundle install` to install missing gems"
|
8
|
+
exit e.status_code
|
9
|
+
end
|
10
|
+
require 'test/unit'
|
11
|
+
|
12
|
+
$LOAD_PATH.unshift(File.join(File.dirname(__FILE__), '..', 'lib'))
|
13
|
+
$LOAD_PATH.unshift(File.dirname(__FILE__))
|
14
|
+
require 'fluent/test'
|
15
|
+
unless ENV.has_key?('VERBOSE')
|
16
|
+
nulllogger = Object.new
|
17
|
+
nulllogger.instance_eval {|obj|
|
18
|
+
def method_missing(method, *args)
|
19
|
+
# pass
|
20
|
+
end
|
21
|
+
}
|
22
|
+
$log = nulllogger
|
23
|
+
end
|
24
|
+
|
25
|
+
require 'fluent/plugin/in_cloudfront_log'
|
26
|
+
|
27
|
+
class Test::Unit::TestCase
|
28
|
+
end
|
data/test/plugin/test_in_cloudfrontlog.rb
ADDED
@@ -0,0 +1,108 @@
|
|
1
|
+
require_relative '../helper'
|
2
|
+
require 'fluent/test'
|
3
|
+
require 'fluent/test/driver/input'
|
4
|
+
|
5
|
+
class Cloudfront_LogInputTest < Test::Unit::TestCase
|
6
|
+
setup do
|
7
|
+
Fluent::Test.setup
|
8
|
+
end
|
9
|
+
|
10
|
+
MINIMAL_CONFIG = %[
|
11
|
+
region ap-northeast-1
|
12
|
+
log_bucket bucket-name
|
13
|
+
log_prefix a/b/c
|
14
|
+
|
15
|
+
# aws_key_id AKIAZZZZZZZZZZZZZZZZ
|
16
|
+
# aws_sec_key 1234567890qwertyuiopasdfghjklzxcvbnm
|
17
|
+
# moved_log_bucket bucket-name-moved
|
18
|
+
# moved_log_prefix a/b/c_moved
|
19
|
+
# tag cloudfront
|
20
|
+
# interval 500
|
21
|
+
# verbose true
|
22
|
+
# thread_num 8
|
23
|
+
# parse_date_time false
|
24
|
+
]
|
25
|
+
|
26
|
+
def create_driver(conf = MINIMAL_CONFIG)
|
27
|
+
Fluent::Test::Driver::Input.new(Fluent::Cloudfront_LogInput).configure(conf)
|
28
|
+
end
|
29
|
+
|
30
|
+
test "create_driver doesn't raise error" do
|
31
|
+
assert_nothing_raised { create_driver }
|
32
|
+
end
|
33
|
+
|
34
|
+
sub_test_case "required parameters" do
|
35
|
+
test "region is required" do
|
36
|
+
exception = assert_raise(Fluent::ConfigError) {
|
37
|
+
create_driver(MINIMAL_CONFIG.gsub(/region.*$/, ''))
|
38
|
+
}
|
39
|
+
assert_equal("'region' parameter is required", exception.message)
|
40
|
+
end
|
41
|
+
|
42
|
+
test "log_bucket is required" do
|
43
|
+
exception = assert_raise(Fluent::ConfigError) {
|
44
|
+
create_driver(MINIMAL_CONFIG.gsub(/log_bucket.*$/, ''))
|
45
|
+
}
|
46
|
+
assert_equal("'log_bucket' parameter is required", exception.message)
|
47
|
+
end
|
48
|
+
|
49
|
+
test "log_prefix is required" do
|
50
|
+
exception = assert_raise(Fluent::ConfigError) {
|
51
|
+
create_driver(MINIMAL_CONFIG.gsub(/log_prefix.*$/, ''))
|
52
|
+
}
|
53
|
+
assert_equal("'log_prefix' parameter is required", exception.message)
|
54
|
+
end
|
55
|
+
end
|
56
|
+
|
57
|
+
sub_test_case "default values" do
|
58
|
+
test "moved_log_bucket is set to log_bucket" do
|
59
|
+
driver = create_driver(MINIMAL_CONFIG)
|
60
|
+
assert_equal(driver.instance.log_bucket, driver.instance.moved_log_bucket)
|
61
|
+
end
|
62
|
+
|
63
|
+
test "moved_log_prefix is set to '_moved'" do
|
64
|
+
driver = create_driver(MINIMAL_CONFIG)
|
65
|
+
assert_equal('_moved', driver.instance.moved_log_prefix)
|
66
|
+
end
|
67
|
+
|
68
|
+
test "tag is set to 'cloudfront.access'" do
|
69
|
+
driver = create_driver(MINIMAL_CONFIG)
|
70
|
+
assert_equal('cloudfront.access', driver.instance.tag)
|
71
|
+
end
|
72
|
+
|
73
|
+
test "verbose is set to false" do
|
74
|
+
driver = create_driver(MINIMAL_CONFIG)
|
75
|
+
assert_equal(false, driver.instance.verbose)
|
76
|
+
end
|
77
|
+
|
78
|
+
test "interval is set to 300" do
|
79
|
+
driver = create_driver(MINIMAL_CONFIG)
|
80
|
+
assert_equal(300, driver.instance.interval)
|
81
|
+
end
|
82
|
+
|
83
|
+
test "thread_num is set to 4" do
|
84
|
+
driver = create_driver(MINIMAL_CONFIG)
|
85
|
+
assert_equal(4, driver.instance.thread_num)
|
86
|
+
end
|
87
|
+
|
88
|
+
test "s3_get_max is set to 200" do
|
89
|
+
driver = create_driver(MINIMAL_CONFIG)
|
90
|
+
assert_equal(200, driver.instance.s3_get_max)
|
91
|
+
end
|
92
|
+
|
93
|
+
test "parse_date_time true" do
|
94
|
+
driver = create_driver(MINIMAL_CONFIG)
|
95
|
+
assert_equal(true, driver.instance.parse_date_time)
|
96
|
+
end
|
97
|
+
end
|
98
|
+
|
99
|
+
sub_test_case "set specific values" do
|
100
|
+
test "moved_log_prefix is set to 'my-prefix'" do
|
101
|
+
driver = create_driver(MINIMAL_CONFIG + %[
|
102
|
+
moved_log_prefix 'my-prefix'
|
103
|
+
])
|
104
|
+
assert_equal('my-prefix', driver.instance.moved_log_prefix)
|
105
|
+
end
|
106
|
+
end
|
107
|
+
|
108
|
+
end
|
metadata
ADDED
@@ -0,0 +1,149 @@
|
|
1
|
+
--- !ruby/object:Gem::Specification
|
2
|
+
name: fluent-plugin-cloudfront-log-optimized
|
3
|
+
version: !ruby/object:Gem::Version
|
4
|
+
version: 0.2.0
|
5
|
+
platform: ruby
|
6
|
+
authors:
|
7
|
+
- kubihee
|
8
|
+
- lenfree
|
9
|
+
- kjwierenga
|
10
|
+
autorequire:
|
11
|
+
bindir: bin
|
12
|
+
cert_chain: []
|
13
|
+
date: 2021-02-06 00:00:00.000000000 Z
|
14
|
+
dependencies:
|
15
|
+
- !ruby/object:Gem::Dependency
|
16
|
+
name: fluentd
|
17
|
+
requirement: !ruby/object:Gem::Requirement
|
18
|
+
requirements:
|
19
|
+
- - ">="
|
20
|
+
- !ruby/object:Gem::Version
|
21
|
+
version: 0.14.0
|
22
|
+
- - "<"
|
23
|
+
- !ruby/object:Gem::Version
|
24
|
+
version: '2'
|
25
|
+
type: :runtime
|
26
|
+
prerelease: false
|
27
|
+
version_requirements: !ruby/object:Gem::Requirement
|
28
|
+
requirements:
|
29
|
+
- - ">="
|
30
|
+
- !ruby/object:Gem::Version
|
31
|
+
version: 0.14.0
|
32
|
+
- - "<"
|
33
|
+
- !ruby/object:Gem::Version
|
34
|
+
version: '2'
|
35
|
+
- !ruby/object:Gem::Dependency
|
36
|
+
name: aws-sdk-s3
|
37
|
+
requirement: !ruby/object:Gem::Requirement
|
38
|
+
requirements:
|
39
|
+
- - "~>"
|
40
|
+
- !ruby/object:Gem::Version
|
41
|
+
version: '1'
|
42
|
+
type: :runtime
|
43
|
+
prerelease: false
|
44
|
+
version_requirements: !ruby/object:Gem::Requirement
|
45
|
+
requirements:
|
46
|
+
- - "~>"
|
47
|
+
- !ruby/object:Gem::Version
|
48
|
+
version: '1'
|
49
|
+
- !ruby/object:Gem::Dependency
|
50
|
+
name: aws-sdk-sqs
|
51
|
+
requirement: !ruby/object:Gem::Requirement
|
52
|
+
requirements:
|
53
|
+
- - "~>"
|
54
|
+
- !ruby/object:Gem::Version
|
55
|
+
version: '1'
|
56
|
+
type: :runtime
|
57
|
+
prerelease: false
|
58
|
+
version_requirements: !ruby/object:Gem::Requirement
|
59
|
+
requirements:
|
60
|
+
- - "~>"
|
61
|
+
- !ruby/object:Gem::Version
|
62
|
+
version: '1'
|
63
|
+
- !ruby/object:Gem::Dependency
|
64
|
+
name: bundler
|
65
|
+
requirement: !ruby/object:Gem::Requirement
|
66
|
+
requirements:
|
67
|
+
- - "~>"
|
68
|
+
- !ruby/object:Gem::Version
|
69
|
+
version: '1.7'
|
70
|
+
type: :development
|
71
|
+
prerelease: false
|
72
|
+
version_requirements: !ruby/object:Gem::Requirement
|
73
|
+
requirements:
|
74
|
+
- - "~>"
|
75
|
+
- !ruby/object:Gem::Version
|
76
|
+
version: '1.7'
|
77
|
+
- !ruby/object:Gem::Dependency
|
78
|
+
name: rake
|
79
|
+
requirement: !ruby/object:Gem::Requirement
|
80
|
+
requirements:
|
81
|
+
- - "~>"
|
82
|
+
- !ruby/object:Gem::Version
|
83
|
+
version: '12'
|
84
|
+
type: :development
|
85
|
+
prerelease: false
|
86
|
+
version_requirements: !ruby/object:Gem::Requirement
|
87
|
+
requirements:
|
88
|
+
- - "~>"
|
89
|
+
- !ruby/object:Gem::Version
|
90
|
+
version: '12'
|
91
|
+
- !ruby/object:Gem::Dependency
|
92
|
+
name: test-unit
|
93
|
+
requirement: !ruby/object:Gem::Requirement
|
94
|
+
requirements:
|
95
|
+
- - "~>"
|
96
|
+
- !ruby/object:Gem::Version
|
97
|
+
version: '2'
|
98
|
+
type: :development
|
99
|
+
prerelease: false
|
100
|
+
version_requirements: !ruby/object:Gem::Requirement
|
101
|
+
requirements:
|
102
|
+
- - "~>"
|
103
|
+
- !ruby/object:Gem::Version
|
104
|
+
version: '2'
|
105
|
+
description: AWS CloudFront log input plugin for fluentd. Upstream appears to be unmaintained.
|
106
|
+
email:
|
107
|
+
- kubihie@gmail.com
|
108
|
+
- lenfree.yeung@gmail.com
|
109
|
+
- k.j.wierenga@gmail.com
|
110
|
+
executables: []
|
111
|
+
extensions: []
|
112
|
+
extra_rdoc_files: []
|
113
|
+
files:
|
114
|
+
- ".gitignore"
|
115
|
+
- CHANGELOG.md
|
116
|
+
- Gemfile
|
117
|
+
- README.md
|
118
|
+
- Rakefile
|
119
|
+
- fluent-plugin-cloudfront-log-optimized.gemspec
|
120
|
+
- lib/fluent/plugin/enumerable_inflater.rb
|
121
|
+
- lib/fluent/plugin/in_cloudfront_log.rb
|
122
|
+
- test/helper.rb
|
123
|
+
- test/plugin/test_in_cloudfrontlog.rb
|
124
|
+
homepage: https://github.com/kjwierenga/fluent-plugin-cloudfront-log-optimized
|
125
|
+
licenses: []
|
126
|
+
metadata: {}
|
127
|
+
post_install_message:
|
128
|
+
rdoc_options: []
|
129
|
+
require_paths:
|
130
|
+
- lib
|
131
|
+
required_ruby_version: !ruby/object:Gem::Requirement
|
132
|
+
requirements:
|
133
|
+
- - ">="
|
134
|
+
- !ruby/object:Gem::Version
|
135
|
+
version: '0'
|
136
|
+
required_rubygems_version: !ruby/object:Gem::Requirement
|
137
|
+
requirements:
|
138
|
+
- - ">="
|
139
|
+
- !ruby/object:Gem::Version
|
140
|
+
version: '0'
|
141
|
+
requirements: []
|
142
|
+
rubygems_version: 3.0.8
|
143
|
+
signing_key:
|
144
|
+
specification_version: 4
|
145
|
+
summary: AWS CloudFront log input plugin optimized for large log files. Credit to
|
146
|
+
kubihie and lenfree.
|
147
|
+
test_files:
|
148
|
+
- test/helper.rb
|
149
|
+
- test/plugin/test_in_cloudfrontlog.rb
|