fluent-plugin-dynamo 1.0.0

checksums.yaml ADDED
@@ -0,0 +1,7 @@
1
+ ---
2
+ SHA256:
3
+ metadata.gz: c19597bfc773c66fa330779d66de8749746cbf7cafcf6e2aa4ea88a4d5b82721
4
+ data.tar.gz: fd2bd597a9f400a4c5a09ac73bdf1f78237e5f4570cada2e1ae54e81b69a1ab2
5
+ SHA512:
6
+ metadata.gz: ced3fba29310892fb2dc725cd5b4852a89390fc48c1261424aadf2c75bf991a0f53f413e03f5c0d5be61cf7ecaebf01ace8dfe52d074fda69ce2625b24f3037b
7
+ data.tar.gz: b556811f8a8a65b315bc2455444cb531cf1a19edc34ae2e3327c135982d0fd807d9b15d4f84e9debcd07277f401e9774852362d5b73bcd58bc1c82cd9af5865e
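The digests in checksums.yaml can be compared against the members of the .gem file, which is a plain tar archive. A minimal sketch, assuming metadata.gz and data.tar.gz have already been extracted into the current directory; the helper name `verify_digest` is hypothetical and not part of RubyGems.

```ruby
require 'digest'

# Returns true when the SHA256 hex digest of the file at `path`
# equals `expected_hex`. Hypothetical helper for illustration only.
def verify_digest(path, expected_hex)
  Digest::SHA256.file(path).hexdigest == expected_hex
end

# Usage, assuming data.tar.gz was extracted from the .gem archive:
#   verify_digest('data.tar.gz',
#     'fd2bd597a9f400a4c5a09ac73bdf1f78237e5f4570cada2e1ae54e81b69a1ab2')
```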
data/AUTHORS ADDED
@@ -0,0 +1,3 @@
1
+ Takashi Matsuno
2
+ Sadayuki Furuhashi
3
+
data/ChangeLog ADDED
@@ -0,0 +1,20 @@
1
+ Release 0.1.8 - 2012/07/10
2
+
3
+ * Fix gem.homepage url
4
+
5
+ Release 0.1.7 - 2012/06/17
6
+
7
+ * Inherits DetachMultiProcessMixin
8
+
9
+ Release 0.1.6 - 2012/06/12
10
+
11
+ * Optimized write(chunk) method not to collect all records in memory
12
+
13
+ Release 0.1.5 - 2012/06/10
14
+
15
+ * First release
16
+
17
+ Release 0.1.0 - 2012/06/09
18
+
19
+ * First commit
20
+
data/Gemfile ADDED
@@ -0,0 +1,3 @@
1
+ source "https://rubygems.org"
2
+
3
+ gemspec
data/README.md ADDED
@@ -0,0 +1,134 @@
1
+ # Amazon DynamoDB output plugin for [Fluentd](http://fluentd.org) event collector
2
+
3
+ ## Installation
4
+
5
+ $ fluent-gem install fluent-plugin-dynamo
6
+
7
+ ## Configuration
8
+
9
+
10
+ ### DynamoDB
11
+
12
+ First, create a table in DynamoDB. The AWS Management Console is the easiest way to do this.
13
+
14
+ Choose the table name, hash attribute name, and throughput to suit your needs. fluent-plugin-dynamo loads the table schema at startup and writes the event stream to that table.
15
+
16
+
17
+ ### Fluentd
18
+
19
+ <match dynamodb.**>
20
+ @type dynamo
21
+ aws_key_id AWS_ACCESS_KEY
22
+ aws_sec_key AWS_SECRET_ACCESS_KEY
23
+ proxy_uri http://user:password@192.168.0.250:3128/
24
+ dynamo_db_endpoint dynamodb.ap-northeast-1.amazonaws.com
25
+ dynamo_db_table access_log
26
+ </match>
27
+
28
+ * **aws\_key\_id (optional)** - AWS access key ID. Required when the agent is not running on an EC2 instance with an IAM instance profile.
29
+ * **aws\_sec\_key (optional)** - AWS secret key. Required when the agent is not running on an EC2 instance with an IAM instance profile.
30
+ * **proxy_uri (optional)** - URL of your HTTP proxy.
31
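+ * **dynamo\_db\_endpoint (optional)** - DynamoDB endpoint. See [Regions and Endpoints](http://docs.amazonwebservices.com/general/latest/gr/rande.html#ddb_region). The plugin source declares this parameter with a default of nil and also accepts `dynamo_db_region` (default: the `AWS_REGION` environment variable, or `us-east-1`).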
32
+ * **dynamo\_db\_table (required)** - table name of dynamodb.
33
+
34
+ ## TIPS
35
+
36
+ ### retrieving data
37
+
38
+ fluent-plugin-dynamo automatically adds a **time** attribute alongside the other attributes of each record.
39
+ For example, if you read Apache access logs via Fluentd, the table will look like this.
40
+
41
+ <table>
42
+ <tr>
43
+ <th>id (Hash Key)</th>
44
+ <th>time</th>
45
+ <th>host</th>
46
+ <th>path</th>
47
+ <th>method</th>
48
+ <th>referer</th>
49
+ <th>code</th>
50
+ <th>agent</th>
51
+ <th>size</th>
52
+ </tr>
53
+ <tr>
54
+ <td>"a937f980-b304-11e1-bc96-c82a14fffef2"</td>
55
+ <td>"2012-06-10T05:26:46Z"</td>
56
+ <td>"192.168.0.3"</td>
57
+ <td>"/index.html"</td>
58
+ <td>"GET"</td>
59
+ <td>"-"</td>
60
+ <td>"200"</td>
61
+ <td>"Mozilla/5.0"</td>
62
+ <td>"4286"</td>
63
+ </tr>
64
+ <tr>
65
+ <td>"a87fc51e-b308-11e1-ba0f-5855caf50759"</td>
66
+ <td>"2012-06-10T05:28:23Z"</td>
67
+ <td>"192.168.0.4"</td>
68
+ <td>"/sample.html"</td>
69
+ <td>"GET"</td>
70
+ <td>"-"</td>
71
+ <td>"200"</td>
72
+ <td>"Mozilla/5.0"</td>
73
+ <td>"8933"</td>
74
+ </tr>
75
+ </table>
76
+
77
+ Items can be retrieved by key, but fluent-plugin-dynamo generates a UUID as the primary key when the record does not supply one.
78
+ As a result, there is no simple way to retrieve specific logs.
79
+ You can write a scan filter with the AWS SDK like [this](https://gist.github.com/2906291), but I think Hive on EMR is the best practice.
80
+
81
+ ### multiprocessing
82
+
83
+ If you need high throughput and have ample provisioned throughput and an abundant buffer, you can set up multiprocessing. fluent-plugin-dynamo supports **multi workers**, so you can launch 6 workers as follows.
84
+
85
+ <match dynamodb.**>
86
+ @type dynamo
87
+ aws_key_id AWS_ACCESS_KEY
88
+ aws_sec_key AWS_SECRET_ACCESS_KEY
89
+ proxy_uri http://user:password@192.168.0.250:3128/
90
+ dynamo_db_endpoint dynamodb.ap-northeast-1.amazonaws.com
91
+ dynamo_db_table access_log
92
+ </match>
93
+ <system>
94
+ workers 6
95
+ </system>
96
+
97
+ ### multi-region redundancy
98
+
99
+ Fluentd ships with the **copy** output plugin.
100
+ You can use it to set up multi-region redundancy as follows.
101
+
102
+ <match dynamo.**>
103
+ @type copy
104
+ <store>
105
+ @type dynamo
106
+ aws_key_id AWS_ACCESS_KEY
107
+ aws_sec_key AWS_SECRET_ACCESS_KEY
108
+ dynamo_db_table test
109
+ dynamo_db_endpoint dynamodb.ap-northeast-1.amazonaws.com
110
+ </store>
111
+ <store>
112
+ @type dynamo
113
+ aws_key_id AWS_ACCESS_KEY
114
+ aws_sec_key AWS_SECRET_ACCESS_KEY
115
+ dynamo_db_table test
116
+ dynamo_db_endpoint dynamodb.ap-southeast-1.amazonaws.com
117
+ </store>
118
+ </match>
119
+
120
+ ## TODO
121
+
122
+ * auto-create table
123
+ * tag_mapped
124
+
125
+ ## Copyright
126
+
127
+ <table>
128
+ <tr>
129
+ <td>Copyright</td><td>Copyright (c) 2012- Takashi Matsuno</td>
130
+ </tr>
131
+ <tr>
132
+ <td>License</td><td>Apache License, Version 2.0</td>
133
+ </tr>
134
+ </table>
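As a sketch of the scan-filter approach mentioned in the README's "retrieving data" tip, the following builds the parameters for a DynamoDB Scan that keeps only items whose **time** attribute falls within a range. It assumes the ISO 8601 UTC time strings shown in the README's example table, which compare correctly as plain strings; the helper name `time_range_scan_params` is hypothetical and not part of the plugin.

```ruby
# Builds parameters for Aws::DynamoDB::Client#scan that keep only items
# whose "time" attribute lies between two ISO 8601 UTC strings.
# Hypothetical helper; not part of the plugin.
def time_range_scan_params(table_name, from, to)
  {
    table_name: table_name,
    # "#t" aliases the attribute name to avoid clashing with reserved words.
    filter_expression: '#t BETWEEN :from AND :to',
    expression_attribute_names: { '#t' => 'time' },
    expression_attribute_values: { ':from' => from, ':to' => to }
  }
end

params = time_range_scan_params('access_log',
                                '2012-06-10T05:00:00Z',
                                '2012-06-10T06:00:00Z')

# With the aws-sdk-dynamodb gem installed and credentials configured,
# this would print the first page of matching items:
#   require 'aws-sdk-dynamodb'
#   client = Aws::DynamoDB::Client.new(region: 'ap-northeast-1')
#   client.scan(params).items.each { |item| puts item }
```

Note that a scan reads every item and applies the filter afterwards, so it consumes read capacity for the whole table.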
data/Rakefile ADDED
@@ -0,0 +1,14 @@
1
+
2
+ require 'bundler'
3
+ Bundler::GemHelper.install_tasks
4
+
5
+ require 'rake/testtask'
6
+
7
+ Rake::TestTask.new(:test) do |test|
8
+ test.libs << 'lib' << 'test'
9
+ test.test_files = FileList['test/*.rb']
10
+ test.verbose = true
11
+ end
12
+
13
+ task :default => [:build]
14
+
data/VERSION ADDED
@@ -0,0 +1 @@
1
+ 1.0.0
data/fluent-plugin-dynamo.gemspec ADDED
@@ -0,0 +1,25 @@
1
+ # encoding: utf-8
2
+ $:.push File.expand_path('../lib', __FILE__)
3
+
4
+ Gem::Specification.new do |gem|
5
+ gem.name = "fluent-plugin-dynamo"
6
+ gem.description = "Amazon DynamoDB output plugin for Fluent event collector"
7
+ gem.homepage = "https://github.com/gonsuke/fluent-plugin-dynamodb"
8
+ gem.summary = gem.description
9
+ gem.license = "Apache-2.0"
10
+ gem.version = File.read("VERSION").strip
11
+ gem.authors = ["Takashi Matsuno"]
12
+ gem.email = "g0n5uk3@gmail.com"
13
+ gem.has_rdoc = false
14
+ #gem.platform = Gem::Platform::RUBY
15
+ gem.files = `git ls-files`.split("\n")
16
+ gem.test_files = `git ls-files -- {test,spec,features}/*`.split("\n")
17
+ gem.executables = `git ls-files -- bin/*`.split("\n").map{ |f| File.basename(f) }
18
+ gem.require_paths = ['lib']
19
+
20
+ gem.add_dependency "fluentd", [">= 0.14.15", "< 2"]
21
+ gem.add_dependency "aws-sdk-dynamodb", [">= 1.0.0", "< 2"]
22
+ gem.add_dependency "uuidtools", "~> 2.1.0"
23
+ gem.add_development_dependency "rake", ">= 0.9.2"
24
+ gem.add_development_dependency "test-unit", ">= 3.1.0"
25
+ end
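The version constraints declared in the gemspec can be checked with RubyGems' own classes. This small sketch shows which versions the `~> 2.1.0` constraint on uuidtools and the `>= 0.14.15, < 2` constraint on fluentd accept; the sample version numbers are chosen for illustration only.

```ruby
require 'rubygems'

uuidtools = Gem::Requirement.new('~> 2.1.0')
fluentd   = Gem::Requirement.new('>= 0.14.15', '< 2')

# "~> 2.1.0" allows patch-level updates within 2.1.x only.
puts uuidtools.satisfied_by?(Gem::Version.new('2.1.5'))  # true
puts uuidtools.satisfied_by?(Gem::Version.new('2.2.0'))  # false

# The fluentd range accepts 0.14.15 up to, but not including, 2.
puts fluentd.satisfied_by?(Gem::Version.new('1.16.0'))   # true
puts fluentd.satisfied_by?(Gem::Version.new('2.0.0'))    # false
```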
data/lib/fluent/plugin/out_dynamo.rb ADDED
@@ -0,0 +1,143 @@
1
+ # -*- coding: utf-8 -*-
2
+ require 'fluent/plugin/output'
3
+ require 'aws-sdk-dynamodb'
4
+ require 'msgpack'
5
+ require 'time'
6
+ require 'uuidtools'
7
+
8
+ module Fluent::Plugin
9
+
10
+
11
+ class DynamoOutput < Fluent::Plugin::Output
12
+ Fluent::Plugin.register_output('dynamo', self)
13
+
14
+ helpers :compat_parameters
15
+
16
+ DEFAULT_BUFFER_TYPE = "memory"
17
+
18
+ BATCHWRITE_ITEM_LIMIT = 25
19
+ BATCHWRITE_CONTENT_SIZE_LIMIT = 1024*1024
20
+
21
+ config_param :aws_key_id, :string, :default => nil, :secret => true
22
+ config_param :aws_sec_key, :string, :default => nil, :secret => true
23
+ config_param :proxy_uri, :string, :default => nil
24
+ config_param :dynamo_db_region, :string, default: ENV["AWS_REGION"] || "us-east-1"
25
+ config_param :dynamo_db_table, :string
26
+ config_param :dynamo_db_endpoint, :string, :default => nil
27
+ config_param :time_format, :string, :default => nil
28
+ config_param :add_time_attribute, :bool, :default => true
29
+ config_param :detach_process, :integer, :default => 2
30
+
31
+ config_section :buffer do
32
+ config_set_default :@type, DEFAULT_BUFFER_TYPE
33
+ end
34
+
35
+ def configure(conf)
36
+ compat_parameters_convert(conf, :buffer)
37
+ super
38
+
39
+ @timef = Fluent::TimeFormatter.new(@time_format, @localtime)
40
+ end
41
+
42
+ def start
43
+ options = {}
44
+ if @aws_key_id && @aws_sec_key
45
+ options[:access_key_id] = @aws_key_id
46
+ options[:secret_access_key] = @aws_sec_key
47
+ end
48
+ options[:region] = @dynamo_db_region if @dynamo_db_region
49
+ options[:endpoint] = @dynamo_db_endpoint
50
+ options[:proxy_uri] = @proxy_uri if @proxy_uri
51
+
52
+ super
53
+
54
+ begin
55
+ restart_session(options)
56
+ valid_table(@dynamo_db_table)
57
+ rescue Fluent::ConfigError => e
58
+ log.fatal "ConfigError: Please check your configuration, then restart fluentd. '#{e}'"
59
+ exit!
60
+ rescue StandardError => e
61
+ log.fatal "UnknownError: '#{e}'"
62
+ exit!
63
+ end
64
+ end
65
+
66
+ def restart_session(options)
67
+ @dynamo_db = Aws::DynamoDB::Client.new(options)
68
+ @resource = Aws::DynamoDB::Resource.new(client: @dynamo_db)
69
+
70
+ end
71
+
72
+ def valid_table(table_name)
73
+ table = @resource.table(table_name)
74
+ @hash_key = table.key_schema.select{|e| e.key_type == "HASH" }.first
75
+ range_key_candidate = table.key_schema.select{|e| e.key_type == "RANGE" }
76
+ @range_key = range_key_candidate.first if range_key_candidate
77
+ end
78
+
79
+ def match_type!(key, record)
80
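+ if key.key_type == "NUMBER" # FIXME: key_schema reports key_type as "HASH" or "RANGE"; the attribute type ("N") lives in attribute_definitions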
81
+ potential_value = record[key.attribute_name].to_i
82
+ if potential_value == 0
83
+ log.fatal "Failed attempt to cast '#{key.attribute_name}' to Integer."
84
+ end
85
+ record[key.attribute_name] = potential_value
86
+ end
87
+ end
88
+
89
+ def format(tag, time, record)
90
+ if !record.key?(@hash_key.attribute_name)
91
+ record[@hash_key.attribute_name] = UUIDTools::UUID.timestamp_create.to_s
92
+ end
93
+ match_type!(@hash_key, record)
94
+
95
+ formatted_time = @timef.format(time)
96
+ if @range_key
97
+ if !record.key?(@range_key.attribute_name)
98
+ record[@range_key.attribute_name] = formatted_time
99
+ end
100
+ match_type!(@range_key, record)
101
+ end
102
+ record['time'] = formatted_time if @add_time_attribute
103
+
104
+ record.to_msgpack
105
+ end
106
+
107
+ def formatted_to_msgpack_binary?
108
+ true
109
+ end
110
+
111
+ def multi_workers_ready?
112
+ true
113
+ end
114
+
115
+ def write(chunk)
116
+ batch_size = 0
117
+ batch_records = []
118
+ chunk.msgpack_each {|record|
119
+ batch_records << {
120
+ put_request: {
121
+ item: record
122
+ }
123
+ }
124
+ batch_size += record.to_json.length # FIXME: heuristic
125
+ if batch_records.size >= BATCHWRITE_ITEM_LIMIT || batch_size >= BATCHWRITE_CONTENT_SIZE_LIMIT
126
+ batch_put_records(batch_records)
127
+ batch_records.clear
128
+ batch_size = 0
129
+ end
130
+ }
131
+ unless batch_records.empty?
132
+ batch_put_records(batch_records)
133
+ end
134
+ end
135
+
136
+ def batch_put_records(records)
137
+ @dynamo_db.batch_write_item(request_items: { @dynamo_db_table => records })
138
+ end
139
+
140
+ end
141
+
142
+
143
+ end
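The batching in `write(chunk)` can be illustrated without DynamoDB. The sketch below groups records so that a batch is flushed once it reaches 25 items or roughly 1 MiB of JSON-encoded content, mirroring `BATCHWRITE_ITEM_LIMIT` and `BATCHWRITE_CONTENT_SIZE_LIMIT`. The method name `each_batch` is hypothetical, and as an assumption it measures size in bytes (`bytesize`) where the plugin counts characters (`length`).

```ruby
require 'json'

ITEM_LIMIT = 25            # mirrors BATCHWRITE_ITEM_LIMIT
SIZE_LIMIT = 1024 * 1024   # mirrors BATCHWRITE_CONTENT_SIZE_LIMIT

# Yields arrays of records; a batch is flushed once it reaches either limit.
# Hypothetical helper for illustration; the plugin sends each batch with
# batch_write_item instead of yielding it.
def each_batch(records, item_limit: ITEM_LIMIT, size_limit: SIZE_LIMIT)
  batch = []
  size = 0
  records.each do |record|
    batch << record
    size += record.to_json.bytesize # JSON size is only an estimate
    next unless batch.size >= item_limit || size >= size_limit

    yield batch
    batch = []
    size = 0
  end
  # Flush whatever is left after the last full batch.
  yield batch unless batch.empty?
end

records = (1..60).map { |i| { 'id' => i.to_s, 'path' => '/index.html' } }
each_batch(records) { |batch| puts batch.size } # prints 25, 25, 10
```

DynamoDB's BatchWriteItem can also return unprocessed items. `batch_put_records` above does not inspect the response, so a production version would likely need to resend them.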
data/test/out_dynamo.rb ADDED
@@ -0,0 +1,65 @@
1
+ require 'fluent/test'
2
+ require 'fluent/test/helpers'
3
+ require 'fluent/test/driver/output'
4
+ require 'fluent/plugin/out_dynamo'
5
+
6
+ class DynamoOutputTest < Test::Unit::TestCase
7
+ include Fluent::Test::Helpers
8
+
9
+ def setup
10
+ Fluent::Test.setup
11
+ end
12
+
13
+ CONFIG = %[
14
+ aws_key_id test_key_id
15
+ aws_sec_key test_sec_key
16
+ dynamo_db_table test_table
17
+ dynamo_db_endpoint test.endpoint
18
+ utc
19
+ buffer_type memory
20
+ ]
21
+
22
+ def create_driver(conf = CONFIG)
23
+ Fluent::Test::Driver::Output.new(Fluent::Plugin::DynamoOutput) do
24
+ def write(chunk)
25
+ chunk.read
26
+ end
27
+ end.configure(conf)
28
+ end
29
+
30
+ def test_configure
31
+ d = create_driver
32
+ assert_equal 'test_key_id', d.instance.aws_key_id
33
+ assert_equal 'test_sec_key', d.instance.aws_sec_key
34
+ assert_equal 'test_table', d.instance.dynamo_db_table
35
+ assert_equal 'test.endpoint', d.instance.dynamo_db_endpoint
36
+ end
37
+
38
+ def test_format
39
+ d = create_driver
40
+
41
+ time = event_time("2011-01-02 13:14:15 UTC")
42
+ d.run(default_tag: 'test') do
43
+ d.feed(time, {"a"=>1})
44
+ d.feed(time, {"a"=>2})
45
+ end
46
+
47
+ expected = [{'a' => 1}].to_msgpack + [{'a' => 2}].to_msgpack
48
+ assert_equal expected, d.formatted
49
+ end
50
+
51
+ def test_write
52
+ d = create_driver
53
+
54
+ time = event_time("2011-01-02 13:14:15 UTC")
55
+ d.run(default_tag: 'test') do
56
+ d.feed(time, {"a"=>1})
57
+ d.feed(time, {"a"=>2})
58
+ end
59
+
60
+ data = d.events
61
+
62
+ assert_equal [time, {'a' => 1}].to_msgpack + [time, {'a' => 2}].to_msgpack, data
63
+ end
64
+
65
+ end
metadata ADDED
@@ -0,0 +1,135 @@
1
+ --- !ruby/object:Gem::Specification
2
+ name: fluent-plugin-dynamo
3
+ version: !ruby/object:Gem::Version
4
+ version: 1.0.0
5
+ platform: ruby
6
+ authors:
7
+ - Takashi Matsuno
8
+ autorequire:
9
+ bindir: bin
10
+ cert_chain: []
11
+ date: 2018-09-27 00:00:00.000000000 Z
12
+ dependencies:
13
+ - !ruby/object:Gem::Dependency
14
+ name: fluentd
15
+ requirement: !ruby/object:Gem::Requirement
16
+ requirements:
17
+ - - ">="
18
+ - !ruby/object:Gem::Version
19
+ version: 0.14.15
20
+ - - "<"
21
+ - !ruby/object:Gem::Version
22
+ version: '2'
23
+ type: :runtime
24
+ prerelease: false
25
+ version_requirements: !ruby/object:Gem::Requirement
26
+ requirements:
27
+ - - ">="
28
+ - !ruby/object:Gem::Version
29
+ version: 0.14.15
30
+ - - "<"
31
+ - !ruby/object:Gem::Version
32
+ version: '2'
33
+ - !ruby/object:Gem::Dependency
34
+ name: aws-sdk-dynamodb
35
+ requirement: !ruby/object:Gem::Requirement
36
+ requirements:
37
+ - - ">="
38
+ - !ruby/object:Gem::Version
39
+ version: 1.0.0
40
+ - - "<"
41
+ - !ruby/object:Gem::Version
42
+ version: '2'
43
+ type: :runtime
44
+ prerelease: false
45
+ version_requirements: !ruby/object:Gem::Requirement
46
+ requirements:
47
+ - - ">="
48
+ - !ruby/object:Gem::Version
49
+ version: 1.0.0
50
+ - - "<"
51
+ - !ruby/object:Gem::Version
52
+ version: '2'
53
+ - !ruby/object:Gem::Dependency
54
+ name: uuidtools
55
+ requirement: !ruby/object:Gem::Requirement
56
+ requirements:
57
+ - - "~>"
58
+ - !ruby/object:Gem::Version
59
+ version: 2.1.0
60
+ type: :runtime
61
+ prerelease: false
62
+ version_requirements: !ruby/object:Gem::Requirement
63
+ requirements:
64
+ - - "~>"
65
+ - !ruby/object:Gem::Version
66
+ version: 2.1.0
67
+ - !ruby/object:Gem::Dependency
68
+ name: rake
69
+ requirement: !ruby/object:Gem::Requirement
70
+ requirements:
71
+ - - ">="
72
+ - !ruby/object:Gem::Version
73
+ version: 0.9.2
74
+ type: :development
75
+ prerelease: false
76
+ version_requirements: !ruby/object:Gem::Requirement
77
+ requirements:
78
+ - - ">="
79
+ - !ruby/object:Gem::Version
80
+ version: 0.9.2
81
+ - !ruby/object:Gem::Dependency
82
+ name: test-unit
83
+ requirement: !ruby/object:Gem::Requirement
84
+ requirements:
85
+ - - ">="
86
+ - !ruby/object:Gem::Version
87
+ version: 3.1.0
88
+ type: :development
89
+ prerelease: false
90
+ version_requirements: !ruby/object:Gem::Requirement
91
+ requirements:
92
+ - - ">="
93
+ - !ruby/object:Gem::Version
94
+ version: 3.1.0
95
+ description: Amazon DynamoDB output plugin for Fluent event collector
96
+ email: g0n5uk3@gmail.com
97
+ executables: []
98
+ extensions: []
99
+ extra_rdoc_files: []
100
+ files:
101
+ - AUTHORS
102
+ - ChangeLog
103
+ - Gemfile
104
+ - README.md
105
+ - Rakefile
106
+ - VERSION
107
+ - fluent-plugin-dynamo.gemspec
108
+ - lib/fluent/plugin/out_dynamo.rb
109
+ - pkg/fluent-plugin-dynamodb-0.2.0.gem
110
+ - test/out_dynamo.rb
111
+ homepage: https://github.com/gonsuke/fluent-plugin-dynamodb
112
+ licenses:
113
+ - Apache-2.0
114
+ metadata: {}
115
+ post_install_message:
116
+ rdoc_options: []
117
+ require_paths:
118
+ - lib
119
+ required_ruby_version: !ruby/object:Gem::Requirement
120
+ requirements:
121
+ - - ">="
122
+ - !ruby/object:Gem::Version
123
+ version: '0'
124
+ required_rubygems_version: !ruby/object:Gem::Requirement
125
+ requirements:
126
+ - - ">="
127
+ - !ruby/object:Gem::Version
128
+ version: '0'
129
+ requirements: []
130
+ rubyforge_project:
131
+ rubygems_version: 2.7.6
132
+ signing_key:
133
+ specification_version: 4
134
+ summary: Amazon DynamoDB output plugin for Fluent event collector
135
+ test_files: []