fluent-plugin-pg-logical 0.0.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -0,0 +1,7 @@
1
+ ---
2
+ SHA1:
3
+ metadata.gz: 7f9e6ce82c9348118dd65343c808d07ca9a5a970
4
+ data.tar.gz: 5b658de92f88870b16136a19279f88dd013120b9
5
+ SHA512:
6
+ metadata.gz: 1f2d70eda0cf9e4b111fa48820650b3f26ee9911a6f9e1f90858a282ff9c59021dad528e99cfd2142c1d1e4212112636c6ea4099510b6f7924d48009f593b0f6
7
+ data.tar.gz: 5491912d4db77d84a9564c85e471ee4c67aaa2cd402ebf0b215522e06685e5d3f50823abeed5bb0c40f8efb574ca69ba47d308742476eaebf808f8629ebd580e
data/.gitignore ADDED
@@ -0,0 +1,5 @@
1
+ *.gem
2
+ .bundle
3
+ Gemfile.lock
4
+ pkg/*
5
+ vendor/*
data/Gemfile ADDED
@@ -0,0 +1,4 @@
1
+ source 'https://rubygems.org'
2
+
3
+ # Specify your gem's dependencies in fluent-plugin-pg-logical.gemspec
4
+ gemspec
data/LICENSE ADDED
@@ -0,0 +1,14 @@
1
+ Copyright (c) 2018- Masahiko Sawada
2
+
3
+ Licensed under the Apache License, Version 2.0 (the "License");
4
+ you may not use this file except in compliance with the License.
5
+ You may obtain a copy of the License at
6
+
7
+ http://www.apache.org/licenses/LICENSE-2.0
8
+
9
+ Unless required by applicable law or agreed to in writing, software
10
+ distributed under the License is distributed on an "AS IS" BASIS,
11
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
12
+ See the License for the specific language governing permissions and
13
+ limitations under the License.
14
+
data/README.md ADDED
@@ -0,0 +1,110 @@
1
+ # fluent-plugin-pg-logical
2
+
3
+ ## Overview
4
+
5
+ Fluentd input plugin to track change events (insert/update/delete) on PostgreSQL using logical decoding.
6
+
7
+ This plugin works as a PostgreSQL WAL receiver and requires a logical decoding plugin to be installed on the upstream PostgreSQL server.
8
+
9
+ ## Installation
10
+
11
+ Install with the gem or fluent-gem command:
12
+
13
+ `````
14
+ # for system installed fluentd
15
+ $ gem install fluent-plugin-pg-logical
16
+ `````
17
+
18
+ ## Configuration
19
+
20
+ |Parameter|Type|Default|Remarks|
21
+ |:--------|:---|:------|:----------|
22
+ |host|string|'localhost'|-|
23
+ |port|integer|5432|-|
24
+ |user|string|'postgres'|-|
25
+ |password|string|nil|-|
26
+ |dbname|string|'postgres'|-|
27
+ |slotname|string|nil|Required|
28
+ |plugin|string|nil|Required if 'create_slot' is specified|
29
+ |status_interval|integer|10|Minimum interval, in seconds, between replication progress reports sent to the upstream server|
30
+ |tag|string|nil|-|
31
+ |create_slot|bool|false|Create the specified replication slot before streaming starts|
32
+ |if_not_exists|bool|false|Do not error if slot already exists when creating a slot|
33
+
34
+ ## Restrictions
35
+ * Because logical decoding supports only data changes (i.e. INSERT/UPDATE/DELETE), other changes such as DDL and sequence updates do not appear in the fluentd input.
36
+ * One replication slot is required for each fluent-plugin-pg-logical connection.
37
+
38
+ ## Example with wal2json
39
+ fluent-plugin-pg-logical requires a logical decoding plugin to get the logical change set. This is an example of using fluent-plugin-pg-logical with [wal2json](https://github.com/eulerto/wal2json), which decodes WAL into JSON objects.
40
+
41
+ 1. Install wal2json to PostgreSQL
42
+ Please refer to the "Build and Install" section in the wal2json documentation.
43
+
44
+ 2. Setting Configuration Parameters
45
+ ```
46
+ <source>
47
+ @type pg_logical
48
+ host pgserver
49
+ port 5432
50
+ user postgres
51
+ dbname replication_db
52
+ slotname wal2json_slot
53
+ plugin wal2json
54
+ create_slot true
55
+ if_not_exists true
56
+ </source>
57
+ ```
58
+
59
+ 3. Run fluentd
60
+ Launch fluentd.
61
+
62
+ 4. Issue some SQL
63
+ ```sql
64
+ =# CREATE TABLE hoge (c int primary key);
65
+ CREATE TABLE
66
+ =# INSERT INTO hoge VALUES (1), (2), (3);
67
+ INSERT 0 3
68
+ =# BEGIN;
69
+ BEGIN
70
+ =# UPDATE hoge SET c = c + 10 WHERE c = 1;
71
+ UPDATE 1
72
+ =# UPDATE hoge SET c = c + 20 WHERE c = 2;
73
+ UPDATE 1
74
+ =# COMMIT;
75
+ COMMIT
76
+ ```
77
+
78
+ You will get:
79
+
80
+ ```
81
+ 2018-02-03 16:02:20.073058428 +0900 : "{\"change\":[]}"
82
+ 2018-02-03 16:02:38.266394490 +0900 : "{\"change\":[{\"kind\":\"insert\",\"schema\":\"public\",\"table\":\"hoge\",\"columnnames\":[\"c\"],\"columntypes\":[\"integer\"],\"columnvalues\":[1]},{\"kind\":\"insert\",\"schema\":\"public\",\"table\":\"hoge\",\"columnnames\":[\"c\"],\"columntypes\":[\"integer\"],\"columnvalues\":[2]},{\"kind\":\"insert\",\"schema\":\"public\",\"table\":\"hoge\",\"columnnames\":[\"c\"],\"columntypes\":[\"integer\"],\"columnvalues\":[3]}]}"
83
+ 2018-02-03 16:03:05.890485185 +0900 : "{\"change\":[{\"kind\":\"update\",\"schema\":\"public\",\"table\":\"hoge\",\"columnnames\":[\"c\"],\"columntypes\":[\"integer\"],\"columnvalues\":[11],\"oldkeys\":{\"keynames\":[\"c\"],\"keytypes\":[\"integer\"],\"keyvalues\":[1]}},{\"kind\":\"update\",\"schema\":\"public\",\"table\":\"hoge\",\"columnnames\":[\"c\"],\"columntypes\":[\"integer\"],\"columnvalues\":[22],\"oldkeys\":{\"keynames\":[\"c\"],\"keytypes\":[\"integer\"],\"keyvalues\":[2]}}]}"
84
+ ```
85
+ Because PostgreSQL (at least up to version 10) does not support DDL replication, the `CREATE TABLE` command does not appear in the fluentd input.
86
+
87
+
88
+ You can also monitor the activity of fluent-plugin-pg-logical on the upstream server.
89
+
90
+ ```sql
91
+ =# SELECT usename, application_name, sent_location, write_location, flush_location FROM pg_stat_replication ;
92
+
93
+ usename | application_name | sent_location | write_location | flush_location
94
+ ----------+------------------+---------------+----------------+----------------
95
+ masahiko | pg-logical | 0/15ADD70 | 0/15ADAC8 | 0/15ADAC8
96
+ (1 row)
97
+
98
+ ```
99
+
100
+ ## TODO
101
+ * Add travis test
102
+ * Table filtering
103
+
104
+ ## Copyright
105
+
106
+ Copyright © 2018- Masahiko Sawada
107
+
108
+ ## License
109
+
110
+ Apache License, Version 2.0
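The README example above shows that each emitted record is the wal2json payload as a JSON string. As a minimal sketch of how a consumer might unpack such a record, the Ruby below parses one payload copied from that example output. The helper name `summarize_changes` and the shape of the returned hashes are assumptions made for illustration; they are not part of the plugin.

```ruby
require 'json'

# One emitted record, copied from the README example output
# (the transaction that ran the two UPDATE statements).
payload = '{"change":[{"kind":"update","schema":"public","table":"hoge","columnnames":["c"],"columntypes":["integer"],"columnvalues":[11],"oldkeys":{"keynames":["c"],"keytypes":["integer"],"keyvalues":[1]}},{"kind":"update","schema":"public","table":"hoge","columnnames":["c"],"columntypes":["integer"],"columnvalues":[22],"oldkeys":{"keynames":["c"],"keytypes":["integer"],"keyvalues":[2]}}]}'

# Hypothetical helper (not part of the plugin): turn one wal2json
# payload into an array of flat per-change hashes.
def summarize_changes(payload)
  JSON.parse(payload).fetch('change', []).map do |change|
    # Pair column names with their new values; either list may be absent.
    names  = change.fetch('columnnames', [])
    values = change.fetch('columnvalues', [])
    row = Hash[names.zip(values)]

    # "oldkeys" is only present in some changes (see the update records).
    old = change['oldkeys']
    old_key = old ? Hash[old['keynames'].zip(old['keyvalues'])] : nil

    {
      kind: change['kind'],
      table: "#{change['schema']}.#{change['table']}",
      row: row,
      old_key: old_key
    }
  end
end

summarize_changes(payload).each do |c|
  puts "#{c[:kind]} on #{c[:table]}: key=#{c[:old_key].inspect} new=#{c[:row].inspect}"
end
```

An empty transaction, such as the `{"change":[]}` record in the example output, simply yields an empty array.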
data/Rakefile ADDED
@@ -0,0 +1,9 @@
1
+ require "bundler/gem_tasks"
2
+ require "rake/testtask"
3
+ Rake::TestTask.new(:test) do |test|
4
+ test.libs << 'lib' << 'test'
5
+ test.pattern = 'test/**/test_*.rb'
6
+ test.verbose = true
7
+ end
8
+
9
+ task :default => :test
data/fluent-plugin-pg-logical.gemspec ADDED
@@ -0,0 +1,24 @@
1
+ # -*- encoding: utf-8 -*-
2
+ Gem::Specification.new do |s|
3
+ s.name = "fluent-plugin-pg-logical"
4
+ s.version = "0.0.1"
5
+ s.authors = ["Masahiko Sawada"]
6
+ s.email = ["sawada.mshk@gmail.com"]
7
+ s.homepage = "https://github.com/MasahikoSawada/fluent-plugin-pg-logical"
8
+ s.summary = %q{Fluentd input plugin to track changes on PostgreSQL server using logical decoding}
9
+ s.license = "Apache-2.0"
10
+
11
+ s.files = `git ls-files`.split("\n")
12
+ s.test_files = `git ls-files -- {test,spec,features}/*`.split("\n")
13
+ s.executables = `git ls-files -- bin/*`.split("\n").map{ |f| File.basename(f) }
14
+ s.require_paths = ["lib"]
15
+
16
+ s.required_ruby_version = "> 2.1"
17
+
18
+ s.add_development_dependency "rake"
19
+ s.add_development_dependency "webmock", "~> 1.24.0"
20
+ s.add_development_dependency "test-unit", ">= 3.1.0"
21
+
22
+ s.add_runtime_dependency "fluentd"
23
+ s.add_runtime_dependency "pg"
24
+ end
data/lib/fluent/plugin/in_pg_logical.rb ADDED
@@ -0,0 +1,311 @@
1
+ require 'fluent/input'
2
+
3
+ module Fluent
4
+ class PgLogicalInput < Fluent::Input
5
+ # Register input plugin
6
+ Plugin.register_input('pg_logical', self)
7
+
8
+ def initialize
9
+ require 'pg'
10
+ super
11
+ end
12
+
13
+ config_param :host, :string, :default => 'localhost'
14
+ config_param :port, :integer, :default => 5432
15
+ config_param :user, :string, :default => 'postgres'
16
+ config_param :password, :string, :default => nil, :secret => true
17
+ config_param :dbname, :string, :default => 'postgres'
18
+ config_param :slotname, :string, :default => nil
19
+ config_param :plugin, :string, :default => nil
20
+ config_param :status_interval, :integer, :default => 10
21
+ config_param :tag, :string, :default => nil
22
+ config_param :create_slot, :bool, :default => false
23
+ config_param :if_not_exists, :bool, :default => false
24
+ def configure(conf)
25
+ super
26
+
27
+ # 'slotname' parameter is required.
28
+ if (@slotname.nil?)
29
+ raise Fluent::ConfigError, "pg-logical: missing 'slotname' parameter."
30
+ end
31
+
32
+ # If 'create_slot' parameter is specified, 'plugin' name is required.
33
+ if (@create_slot and @plugin.nil?)
34
+ raise Fluent::ConfigError, "pg-logical: 'create_slot' parameter requires the 'plugin' parameter."
35
+ end
36
+
37
+ log.info ":host=>#{host} :dbname=>#{dbname} :port=>#{port} :user=>#{user} :tag=>#{tag} :slotname=>#{slotname} :plugin=>#{plugin} :status_interval=>#{status_interval}"
38
+ end
39
+
40
+ def start
41
+ @thread = Thread.new(&method(:run))
42
+ end
43
+
44
+ def shutdown
45
+ if (!@conn.nil?)
46
+ @conn.put_copy_end()
47
+ @conn.flush()
48
+ end
49
+
50
+ Thread.kill(@thread)
51
+ end
52
+
53
+ def run
54
+ begin
55
+ streamLogicalLog
56
+ rescue StandardError => e
57
+ log.error "pg_logical: failed to execute query."
58
+ log.error "error: #{e.message}"
59
+ log.error e.backtrace.join("\n")
60
+ end
61
+ end
62
+
63
+ # Start logical replication
64
+ def start_streaming
65
+ # Identify system, and get start lsn
66
+ res = @conn.exec("IDENTIFY_SYSTEM")
67
+ systemid = res.getvalue(0, 0)
68
+ tli = res.getvalue(0, 1)
69
+ xlogpos = res.getvalue(0, 2)
70
+ dbname = res.getvalue(0, 3)
71
+
72
+ # Start logical replication
73
+ strbuf = "START_REPLICATION SLOT %s LOGICAL %s" % [@slotname, xlogpos]
74
+ @conn.exec(strbuf)
75
+ end
76
+
77
+ # Get a connection
78
+ def get_connection
79
+ begin
80
+ return PG::connect(
81
+ :host => @host,
82
+ :port => @port,
83
+ :user => @user,
84
+ :password => @password,
85
+ :dbname => @dbname,
86
+ :application_name => 'pg-logical',
87
+ :replication => "database"
88
+ )
89
+ rescue StandardError => e
90
+ log.warn "pg-logical: #{e}"
91
+ sleep 5
92
+ retry
93
+ end
94
+ end
95
+
96
+ # Main routine of pg-logical plugin. Stream logical WAL.
97
+ def streamLogicalLog
98
+ begin
99
+ @conn = get_connection()
100
+
101
+ # Create replication slot if required
102
+ create_replication_slot() if @create_slot
103
+
104
+ # Start replication
105
+ start_streaming()
106
+
107
+ record = nil
108
+ socket = @conn.socket_io
109
+ time_to_abort = false
110
+ last_status = Time.now
111
+ loop do
112
+ # Get current timestamp
113
+ now = Time.now
114
+
115
+ # Send feedback if necessary
116
+ last_status = sendFeedback(now, last_status, false)
117
+
118
+ # Get a decoded WAL decode
119
+ record = @conn.get_copy_data(true)
120
+
121
+ # In async mode with no data available, we block on reading but
122
+ # not more than the specified timeout, so that we can send a
123
+ # response back to the upstream server.
124
+ if (record == false)
125
+ # XXX: maybe better to use libev?
126
+ r = select([socket], [], [], 10.0)
127
+
128
+ if (r.nil?)
129
+ # Got a timeout or signal. Continue the loop and either
130
+ # deliver a status packet to the server or just go back into
131
+ # blocking.
132
+ next
133
+ end
134
+
135
+ # There is actual data on socket, consume it.
136
+ @conn.consume_input()
137
+ next
138
+ end
139
+
140
+ # A nil record means that the copy is done.
141
+ if (record.nil?)
142
+ next
143
+ end
144
+
145
+ # Process a record, get extracted record
146
+ wal = extractRecord(record)
147
+
148
+ if (wal[:type] == 'w') # WAL data
149
+ #log.info "[GET w] start : #{wal[:start_lsn]}, end : #{wal[:end_lsn]}, time : #{wal[:send_time]}, data : #{wal[:data]}"
150
+ last_status = sendFeedback(now, last_status, true)
151
+
152
+ router.emit(@tag, Fluent::Engine.now, wal[:data])
153
+
154
+ elsif (wal[:type] == 'k') # Keepalive data
155
+ #log.info "[GET k] end : #{wal[:end_lsn]}, time : #{wal[:send_time]}, reply_required : #{wal[:reply_required]}"
156
+
157
+ if (wal[:reply_required] == 1)
158
+ last_status = sendFeedback(now, last_status, true)
159
+ end
160
+ end
161
+ end
162
+ rescue StandardError => e
163
+ log.warn "pg-logical: #{e}"
164
+ sleep 5
165
+ retry
166
+ ensure
167
+ @conn.finish if !@conn.nil?
168
+ end
169
+ end
170
+
171
+ # Return extracted WAL data into a hash map
172
+ def extractRecord(record)
173
+ r = record.unpack("a")
174
+ wal = {}
175
+
176
+ if (r[0] == 'w') # WAL data
177
+ # -- WAL data format ------
178
+ # 1. 'w' : byte
179
+ # 2. start_lsn : uint64
180
+ # 3. end_lsn : uint64
181
+ # 4. send_time : uint64
182
+ # 5. data
183
+ # ------------------------
184
+ r = record.unpack("aNNNNNNa*")
185
+
186
+ start_lsn_h = r[1]
187
+ start_lsn_l = r[2]
188
+ end_lsn_h = r[3]
189
+ end_lsn_l = r[4]
190
+ send_time_h = r[5]
191
+ send_time_l = r[6]
192
+ data = r[7]
193
+
194
+ start_lsn = (start_lsn_h << 32) + start_lsn_l
195
+ end_lsn = (end_lsn_h << 32) + end_lsn_l
196
+ send_time = (send_time_h << 32) + send_time_l
197
+
198
+ wal[:type] = 'w'
199
+ wal[:start_lsn] = start_lsn
200
+ wal[:end_lsn] = end_lsn
201
+ wal[:send_time] = send_time
202
+ wal[:data] = data
203
+ elsif (r[0] == 'k') # keepalive message
204
+ # -- Keepalive format ------
205
+ # 1. 'k' : byte
206
+ # 2. end_lsn : uint64
207
+ # 3. send_time : uint64
208
+ # 4. reply_required : byte
209
+ # ------------------------
210
+ r = record.unpack("aNNNNc")
211
+
212
+ end_lsn_h = r[1]
213
+ end_lsn_l = r[2]
214
+ send_time_h = r[3]
215
+ send_time_l = r[4]
216
+ reply_required = r[5]
217
+
218
+ end_lsn = (end_lsn_h << 32) + end_lsn_l
219
+ send_time = (send_time_h << 32) + send_time_l
220
+
221
+ wal[:type] = 'k'
222
+ wal[:end_lsn] = end_lsn
223
+ wal[:send_time] = send_time
224
+ wal[:reply_required] = reply_required
225
+ end
226
+
227
+ # Update receive LSN
228
+ if (@recv_lsn.nil? or wal[:end_lsn] > @recv_lsn)
229
+ @recv_lsn = wal[:end_lsn]
230
+ end
231
+
232
+ return wal
233
+ end
234
+
235
+ # Return the last feedback time
236
+ def sendFeedback(now, last_status, force)
237
+
238
+ # Unless a report is forced, return early when less than
239
+ # status_interval seconds have passed since the last report,
240
+ # before doing anything at all.
241
+ if (!force and now - last_status < @status_interval)
242
+ return last_status
243
+ end
244
+
245
+ # Report current status to upstream server
246
+ if (!@recv_lsn.nil?)
247
+ # -- Feedback format ------
248
+ # 1. 'r' : byte
249
+ # 2. write_lsn : uint64
250
+ # 3. flush_lsn : uint64
251
+ # 4. apply_lsn : uint64
252
+ # 5. send_time : uint64
253
+ # 6. reply_required : byte
254
+ # ------------------------
255
+ feedback_msg = ['r']
256
+
257
+ recv_lsn_h = @recv_lsn >> 32
258
+ recv_lsn_l = @recv_lsn & 0xFFFFFFFF
259
+
260
+ # write
261
+ feedback_msg.push(recv_lsn_h)
262
+ feedback_msg.push(recv_lsn_l)
263
+
264
+ # flush
265
+ feedback_msg.push(recv_lsn_h)
266
+ feedback_msg.push(recv_lsn_l)
267
+
268
+ # apply
269
+ feedback_msg.push(0)
270
+ feedback_msg.push(0)
271
+
272
+ # send_time
273
+ now_h = now.to_i >> 32
274
+ now_l = now.to_i & 0xFFFFFFFF
275
+ feedback_msg.push(now_h)
276
+ feedback_msg.push(now_l)
277
+
278
+ # Require reply
279
+ feedback_msg.push(0)
280
+ packed = feedback_msg.pack("aN8c")
281
+
282
+ @conn.flush
283
+ if (!@conn.put_copy_data(packed))
284
+ raise "pg-logical: could not send feedback to upstream server"
285
+ end
286
+
287
+ # Update last_status as we've sent
288
+ last_status = now
289
+ end
290
+
291
+ return last_status
292
+ end
293
+
294
+ # Create a replication slot
295
+ def create_replication_slot
296
+ begin
297
+ strbuf = "CREATE_REPLICATION_SLOT %s LOGICAL %s" % [@slotname, @plugin]
298
+ log.debug strbuf
299
+ @conn.exec(strbuf)
300
+ rescue PG::Error
301
+ # If if_not_exists is set, ignore the error
302
+ if (@if_not_exists)
303
+ log.info "pg-logical: could not create replication slot %s" % @slotname
304
+ return
305
+ end
306
+
307
+ log.error "pg-logical: could not create replication slot %s" % @slotname
308
+ end
309
+ end
310
+ end
311
+ end
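The comments in `extractRecord` and `sendFeedback` above describe three message layouts: WAL data (`w`), keepalive (`k`) and feedback (`r`). As a minimal sketch of those layouts, the Ruby below uses the 64-bit big-endian `Q>` directive instead of splitting each value into two 32-bit halves. The function names are assumptions made for illustration and do not appear in the plugin. The sketch also assumes the send time is expressed as microseconds since 2000-01-01, which is how the PostgreSQL streaming replication protocol documents that field; the plugin code above sends Unix seconds instead.

```ruby
# Seconds between the Unix epoch (1970-01-01) and the PostgreSQL
# epoch (2000-01-01). Assumption: used only by this sketch.
PG_EPOCH_OFFSET = 946_684_800

# Hypothetical helper: decode one message received from the server.
# Returns nil for message types this sketch does not know.
def parse_replication_message(record)
  case record[0]
  when 'w' # WAL data: type, start_lsn, end_lsn, send_time, data
    _, start_lsn, end_lsn, send_time, data = record.unpack('aQ>Q>Q>a*')
    { type: 'w', start_lsn: start_lsn, end_lsn: end_lsn,
      send_time: send_time, data: data }
  when 'k' # Keepalive: type, end_lsn, send_time, reply_required
    _, end_lsn, send_time, reply = record.unpack('aQ>Q>C')
    { type: 'k', end_lsn: end_lsn, send_time: send_time,
      reply_required: reply }
  end
end

# Hypothetical helper: build a feedback message that reports recv_lsn
# as both the write and flush position, with apply left at 0.
def build_feedback(recv_lsn, now = Time.now)
  micros = (now.to_i - PG_EPOCH_OFFSET) * 1_000_000 + now.usec
  ['r', recv_lsn, recv_lsn, 0, micros, 0].pack('aQ>Q>Q>Q>C')
end

# Format a 64-bit LSN in the "high/low" hex form that
# pg_stat_replication displays.
def format_lsn(lsn)
  format('%X/%X', lsn >> 32, lsn & 0xFFFFFFFF)
end

# Usage: round-trip a hand-built WAL data message.
msg = ['w', 0x15ADAC8, 0x15ADD70, 0, '{"change":[]}'].pack('aQ>Q>Q>a*')
wal = parse_replication_message(msg)
puts format_lsn(wal[:end_lsn]) # prints "0/15ADD70"
puts build_feedback(wal[:end_lsn]).bytesize # prints 34
```

Reading each field as one 64-bit value removes the shift-and-add step, at the cost of relying on the `Q>` directive, which needs Ruby 1.9.3 or later.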
metadata ADDED
@@ -0,0 +1,122 @@
1
+ --- !ruby/object:Gem::Specification
2
+ name: fluent-plugin-pg-logical
3
+ version: !ruby/object:Gem::Version
4
+ version: 0.0.1
5
+ platform: ruby
6
+ authors:
7
+ - Masahiko Sawada
8
+ autorequire:
9
+ bindir: bin
10
+ cert_chain: []
11
+ date: 2018-02-05 00:00:00.000000000 Z
12
+ dependencies:
13
+ - !ruby/object:Gem::Dependency
14
+ name: rake
15
+ requirement: !ruby/object:Gem::Requirement
16
+ requirements:
17
+ - - ">="
18
+ - !ruby/object:Gem::Version
19
+ version: '0'
20
+ type: :development
21
+ prerelease: false
22
+ version_requirements: !ruby/object:Gem::Requirement
23
+ requirements:
24
+ - - ">="
25
+ - !ruby/object:Gem::Version
26
+ version: '0'
27
+ - !ruby/object:Gem::Dependency
28
+ name: webmock
29
+ requirement: !ruby/object:Gem::Requirement
30
+ requirements:
31
+ - - "~>"
32
+ - !ruby/object:Gem::Version
33
+ version: 1.24.0
34
+ type: :development
35
+ prerelease: false
36
+ version_requirements: !ruby/object:Gem::Requirement
37
+ requirements:
38
+ - - "~>"
39
+ - !ruby/object:Gem::Version
40
+ version: 1.24.0
41
+ - !ruby/object:Gem::Dependency
42
+ name: test-unit
43
+ requirement: !ruby/object:Gem::Requirement
44
+ requirements:
45
+ - - ">="
46
+ - !ruby/object:Gem::Version
47
+ version: 3.1.0
48
+ type: :development
49
+ prerelease: false
50
+ version_requirements: !ruby/object:Gem::Requirement
51
+ requirements:
52
+ - - ">="
53
+ - !ruby/object:Gem::Version
54
+ version: 3.1.0
55
+ - !ruby/object:Gem::Dependency
56
+ name: fluentd
57
+ requirement: !ruby/object:Gem::Requirement
58
+ requirements:
59
+ - - ">="
60
+ - !ruby/object:Gem::Version
61
+ version: '0'
62
+ type: :runtime
63
+ prerelease: false
64
+ version_requirements: !ruby/object:Gem::Requirement
65
+ requirements:
66
+ - - ">="
67
+ - !ruby/object:Gem::Version
68
+ version: '0'
69
+ - !ruby/object:Gem::Dependency
70
+ name: pg
71
+ requirement: !ruby/object:Gem::Requirement
72
+ requirements:
73
+ - - ">="
74
+ - !ruby/object:Gem::Version
75
+ version: '0'
76
+ type: :runtime
77
+ prerelease: false
78
+ version_requirements: !ruby/object:Gem::Requirement
79
+ requirements:
80
+ - - ">="
81
+ - !ruby/object:Gem::Version
82
+ version: '0'
83
+ description:
84
+ email:
85
+ - sawada.mshk@gmail.com
86
+ executables: []
87
+ extensions: []
88
+ extra_rdoc_files: []
89
+ files:
90
+ - ".gitignore"
91
+ - Gemfile
92
+ - LICENSE
93
+ - README.md
94
+ - Rakefile
95
+ - fluent-plugin-pg-logical.gemspec
96
+ - lib/fluent/plugin/in_pg_logical.rb
97
+ homepage: https://github.com/MasahikoSawada/fluent-plugin-pg-logical
98
+ licenses:
99
+ - Apache-2.0
100
+ metadata: {}
101
+ post_install_message:
102
+ rdoc_options: []
103
+ require_paths:
104
+ - lib
105
+ required_ruby_version: !ruby/object:Gem::Requirement
106
+ requirements:
107
+ - - ">"
108
+ - !ruby/object:Gem::Version
109
+ version: '2.1'
110
+ required_rubygems_version: !ruby/object:Gem::Requirement
111
+ requirements:
112
+ - - ">="
113
+ - !ruby/object:Gem::Version
114
+ version: '0'
115
+ requirements: []
116
+ rubyforge_project:
117
+ rubygems_version: 2.6.13
118
+ signing_key:
119
+ specification_version: 4
120
+ summary: Fluentd input plugin to track changes on PostgreSQL server using logical
121
+ decoding
122
+ test_files: []