redis_failover 0.4.0 → 0.5.0

data/Changes.md CHANGED
@@ -1,3 +1,7 @@
1
+ 0.5.0
2
+ -----------
3
+ - redis_failover is now built on top of ZooKeeper! This means redis_failover enjoys all of the reliability, redundancy, and data consistency offered by ZooKeeper. The old fragile HTTP-based approach has been removed and will no longer be supported in favor of ZooKeeper. This does mean that in order to use redis_failover, you must have ZooKeeper installed and running. Please see the README for steps on how to do this if you don't already have ZooKeeper running in your production environment.
4
+
1
5
  0.4.0
2
6
  -----------
3
7
  - No longer force newly available slaves to master if already slaves of that master
data/README.md CHANGED
@@ -1,8 +1,8 @@
1
- # Automatic Redis Failover Client/Server
1
+ # Automatic Redis Failover Client / Node Manager
2
2
 
3
3
  [![Build Status](https://secure.travis-ci.org/ryanlecompte/redis_failover.png?branch=master)](http://travis-ci.org/ryanlecompte/redis_failover)
4
4
 
5
- Redis Failover attempts to provide a fully automatic master/slave failover solution for Ruby. Redis does not provide
5
+ redis_failover attempts to provide a fully automatic master/slave failover solution for Ruby. Redis does not provide
6
6
  an automatic failover capability when configured for master/slave replication. When the master node dies,
7
7
  a new master must be manually brought online and assigned as the slave's new master. This manual
8
8
  switch-over is not desirable in high traffic sites where Redis is a critical part of the overall
@@ -10,30 +10,34 @@ architecture. The existing standard Redis client for Ruby also only supports con
10
10
  Redis server. When using master/slave replication, it is desirable to have all writes go to the
11
11
  master, and all reads go to one of the N configured slaves.
12
12
 
13
- This gem attempts to address both the server and client problems. A redis failover server runs as a background
14
- daemon and monitors all of your configured master/slave nodes. When the server starts up, it
15
- automatically discovers who is the master and who are the slaves. Watchers are setup for each of
13
+ This gem attempts to address these failover scenarios. A redis failover Node Manager daemon runs as a background
14
+ process and monitors all of your configured master/slave nodes. When the daemon starts up, it
15
+ automatically discovers the current master/slaves. Background watchers are setup for each of
16
16
  the redis nodes. As soon as a node is detected as being offline, it will be moved to an "unavailable" state.
17
17
  If the node that went offline was the master, then one of the slaves will be promoted as the new master.
18
18
  All existing slaves will be automatically reconfigured to point to the new master for replication.
19
19
  All nodes marked as unavailable will be periodically checked to see if they have been brought back online.
20
- If so, the newly available nodes will be configured as slaves and brought back into the list of live
21
- servers. Note that detection of a node going down should be nearly instantaneous, since the mechanism
20
+ If so, the newly available nodes will be configured as slaves and brought back into the list of available
21
+ nodes. Note that detection of a node going down should be nearly instantaneous, since the mechanism
22
22
  used to keep tabs on a node is via a blocking Redis BLPOP call (no polling). This call fails nearly
23
23
  immediately when the node actually goes offline. To avoid false positives (i.e., intermittent flaky
24
- network interruption), the server will only mark a node as unavailable if it fails to communicate with
24
+ network interruption), the Node Manager will only mark a node as unavailable if it fails to communicate with
25
25
  it 3 times (this is configurable via --max-failures, see configuration options below).
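The failure threshold described above can be pictured with a small counter. This is only a sketch under assumptions: `FailureCounter` and its method names are hypothetical, and the README does not say whether failures must be consecutive, so resetting the count on a successful check is an assumption made here for illustration.

```ruby
# Hypothetical helper, not part of the gem: tracks failed checks per node.
class FailureCounter
  def initialize(max_failures = 3)
    @max_failures = max_failures
    @failures = Hash.new(0)
  end

  # Records a failed check and returns true once the node has failed
  # max_failures times and should be treated as unavailable.
  def record_failure(node)
    @failures[node] += 1
    @failures[node] >= @max_failures
  end

  # Assumption: a successful check clears the node's failure count.
  def record_success(node)
    @failures.delete(node)
  end
end

counter = FailureCounter.new(3)
counter.record_failure('localhost:6379') # first failure, still available
counter.record_failure('localhost:6379') # second failure, still available
counter.record_failure('localhost:6379') # third failure, threshold reached
```

A single flaky check therefore does not trigger a failover; only repeated failures do.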
26
26
 
27
27
  This gem provides a RedisFailover::Client wrapper that is master/slave aware. The client is configured
28
- with a single host/port pair that points to redis failover server. The client will automatically
29
- connect to the server to find out the current state of the world (i.e., who's the current master and
30
- who are the current slaves). The client also acts as a load balancer in that it will automatically
28
+ with a list of ZooKeeper servers. The client will automatically contact the ZooKeeper cluster to find out
29
+ the current state of the world (i.e., who is the current master and who are the current slaves). The client
30
+ also sets up a ZooKeeper watcher for the set of redis nodes controlled by the Node Manager daemon. When the daemon
31
+ promotes a new master or detects a node as going down, ZooKeeper will notify the client near-instantaneously so
32
+ that it can rebuild its set of Redis connections. The client also acts as a load balancer in that it will automatically
31
33
  dispatch Redis read operations to one of N slaves, and Redis write operations to the master.
32
- If it fails to communicate with any node, it will go back and ask the server for the current list of
33
- available servers, and then optionally retry the operation.
34
+ If it fails to communicate with any node, it will go back and fetch the current list of available servers, and then
35
+ optionally retry the operation.
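The read/write split described above can be sketched roughly as follows. This is a minimal illustration rather than the gem's code: `READ_OPS` is a small hypothetical subset of the client's read-operation set, and `pick_target` is an invented helper name. Connections are represented by plain strings.

```ruby
require 'set'

# Hypothetical subset of read-only commands; the gem keeps a longer list.
READ_OPS = Set[:get, :exists, :echo]

# Returns the connection an operation would be dispatched to:
# a random slave for reads (falling back to the master when no
# slave is available), and the master for everything else.
def pick_target(method, master, slaves)
  if READ_OPS.include?(method)
    slaves.sample || master
  else
    master
  end
end

pick_target(:get, 'master:6379', ['slave:6380']) # a read goes to a slave
pick_target(:set, 'master:6379', ['slave:6380']) # a write goes to the master
pick_target(:get, 'master:6379', [])             # no slaves: falls back to the master
```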
34
36
 
35
37
  ## Installation
36
38
 
39
+ redis_failover has an external dependency on ZooKeeper. You must have a running ZooKeeper cluster already available in order to use redis_failover. ZooKeeper provides redis_failover with its high availability and data consistency between Redis::Failover clients and the Node Manager daemon. Please see the requirements section below for more information on installing and setting up ZooKeeper if you don't have it running already.
40
+
37
41
  Add this line to your application's Gemfile:
38
42
 
39
43
  gem 'redis_failover'
@@ -46,69 +50,68 @@ Or install it yourself as:
46
50
 
47
51
  $ gem install redis_failover
48
52
 
49
- ## Server Usage
53
+ ## Node Manager Daemon Usage
50
54
 
51
- The redis failover server is a simple process that should be run as a background daemon. The server supports the
55
+ The Node Manager is a simple process that should be run as a background daemon. The daemon supports the
52
56
  following options:
53
57
 
54
- Usage: redis_failover_server [OPTIONS]
55
- -P, --port port Server port
56
- -p, --password password Redis password
57
- -n, --nodes nodes Comma-separated redis host:port pairs
58
- --max-failures count Max failures before server marks node unavailable (default 3)
58
+ Usage: redis_node_manager [OPTIONS]
59
+ -p, --password password Redis password (optional)
60
+ -n, --nodes redis nodes Comma-separated redis host:port pairs (required)
61
+ -z zookeeper servers, Comma-separated zookeeper host:port pairs (required)
62
+ --zkservers
63
+ --znode-path path Znode path override for storing redis server list (optional)
64
+ --max-failures count Max failures before manager marks node unavailable (default 3)
59
65
  -h, --help Display all options
60
66
 
61
- To start the server for a simple master/slave configuration, use the following:
67
+ To start the daemon for a simple master/slave configuration, use the following:
62
68
 
63
- redis_failover_server -P 3000 -n localhost:6379,localhost:6380
69
+ redis_node_manager -n localhost:6379,localhost:6380 -z localhost:2181,localhost:2182,localhost:2183
64
70
 
65
- The server will automatically figure out who is the master and who is the slave upon startup. Note that it is
66
- a good idea to monitor the redis failover server process with a tool like Monit to ensure that it is restarted
71
+ The Node Manager will automatically discover the master/slaves upon startup. Note that it is
72
+ a good idea to monitor the redis Node Manager daemon process with a tool like Monit to ensure that it is restarted
67
73
  in the case of a failure.
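For reference, the value passed to `-n` is turned into a list of host/port hashes. The sketch below mirrors the parsing expression from the CLI code in this diff; the wrapper name `parse_node_list` is hypothetical and exists only to make the snippet self-contained.

```ruby
# Splits 'host1:port,host2:port' into [{:host => host, :port => port}, ...].
def parse_node_list(arg)
  arg.split(',').map do |node|
    Hash[[:host, :port].zip(node.strip.split(':'))]
  end
end

nodes = parse_node_list('localhost:6379, localhost:6380')
# Each entry has :host and :port keys; note that the port stays a String.
```

Whitespace around each pair is stripped, so a space after the comma is tolerated.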
68
74
 
69
75
  ## Client Usage
70
76
 
71
- The redis failover client must be used in conjunction with a running redis failover server. The
72
- client supports various configuration options, however the two mandatory options are the host
73
- and port of the redis failover server:
77
+ The redis failover client must be used in conjunction with a running Node Manager daemon. The
78
+ client supports various configuration options, however the only mandatory option is the list of
79
+ ZooKeeper servers:
74
80
 
75
- client = RedisFailover::Client.new(:host => 'localhost', :port => 3000)
81
+ client = RedisFailover::Client.new(:zkservers => 'localhost:2181,localhost:2182,localhost:2183')
76
82
 
77
83
  The client actually employs the common redis and redis-namespace gems underneath, so this should be
78
84
  a drop-in replacement for your existing pure redis client usage.
79
85
 
80
86
  The full set of options that can be passed to RedisFailover::Client are:
81
87
 
82
- :host - redis failover server host (required)
83
- :port - redis failover server port (required)
84
- :password - optional password for redis nodes
85
- :namespace - optional namespace for redis nodes
86
- :logger - optional logger override
88
+ :zkservers - comma-separated zookeeper host:port pairs (required)
89
+ :znode_path - the Znode path override for redis server list (optional)
90
+ :password - password for redis nodes (optional)
91
+ :namespace - namespace for redis nodes (optional)
92
+ :logger - logger override (optional)
87
93
  :retry_failure - indicate if failures should be retried (default true)
88
94
  :max_retries - max retries for a failure (default 3)
89
95
 
90
96
  ## Requirements
91
97
 
92
- redis_failover is actively tested against MRI 1.9.2/1.9.3. Other rubies may work, although I don't actively test against them. 1.8 is not supported.
98
+ - redis_failover is actively tested against MRI 1.9.2/1.9.3. Other rubies may work, although I don't actively test against them. 1.8 is not supported.
99
+ - redis_failover requires a ZooKeeper service cluster to ensure reliability and data consistency. ZooKeeper is very simple and easy to get up and running. Please refer to this [ZooKeeper Guide](http://zookeeper.apache.org/doc/trunk/zookeeperStarted.html) to get up and running quickly if you don't already have ZooKeeper as a part of your environment.
93
100
 
94
101
  ## Considerations
95
102
 
96
- Note that by default the failover server will mark slaves that are currently syncing with their master as "available" based on the configuration value set for "slave-serve-stale-data" in redis.conf. By default this value is set to "yes" in the configuration, which means that slaves still syncing with their master will be available for servicing read requests. If you don't want this behavior, just set "slave-serve-stale-data" to "no" in your redis.conf file.
97
-
98
- ## Limitations
103
+ - Note that by default the Node Manager will mark slaves that are currently syncing with their master as "available" based on the configuration value set for "slave-serve-stale-data" in redis.conf. By default this value is set to "yes" in the configuration, which means that slaves still syncing with their master will be available for servicing read requests. If you don't want this behavior, just set "slave-serve-stale-data" to "no" in your redis.conf file.
99
104
 
100
- The redis_failover gem currently has limitations. It currently does not gracefully handle network partitions. In cases where
101
- the network splits, it is possible that more than one master could exist until the failover server sees all of the nodes again.
102
- If the failover client gets split from the failover server, it's also possible that it could be talking to a stale master. This would get corrected once the client could successfully reach the failover server again to fetch the latest set of master/slave nodes. This is a limitation that I hope to address in a future release. The gem can not guarantee data consistencies until this is addressed.
105
+ - Note that it's still possible for the RedisFailover::Client instances to see a stale list of servers for a very small window. In most cases this will not be the case due to how ZooKeeper handles distributed communication, but you should be aware that in the worst case the client could write to a "stale" master for a small period of time until the next watch event is received by the client via ZooKeeper.
103
106
 
104
107
  ## TODO
105
108
 
106
- - Integrate with ZooKeeper for full reliability / data consistency.
107
- - Rework specs to work against a set of real Redis nodes as opposed to stubs.
109
+ - Rework specs to work against a set of real Redis/ZooKeeper nodes as opposed to stubs.
108
110
 
109
111
  ## Resources
110
112
 
111
- To learn more about Redis master/slave replication, see the [Redis documentation](http://redis.io/topics/replication).
113
+ - To learn more about Redis master/slave replication, see the [Redis documentation](http://redis.io/topics/replication).
114
+ - To learn more about ZooKeeper, see the official [ZooKeeper](http://zookeeper.apache.org/) site.
112
115
 
113
116
  ## License
114
117
 
File without changes
@@ -1,3 +1,4 @@
1
+ require 'zk'
1
2
  require 'redis'
2
3
  require 'thread'
3
4
  require 'logger'
@@ -11,8 +12,8 @@ require 'redis_failover/util'
11
12
  require 'redis_failover/node'
12
13
  require 'redis_failover/errors'
13
14
  require 'redis_failover/client'
14
- require 'redis_failover/server'
15
15
  require 'redis_failover/runner'
16
16
  require 'redis_failover/version'
17
+ require 'redis_failover/zk_client'
17
18
  require 'redis_failover/node_manager'
18
19
  require 'redis_failover/node_watcher'
@@ -6,25 +6,32 @@ module RedisFailover
6
6
 
7
7
  options = {}
8
8
  parser = OptionParser.new do |opts|
9
- opts.banner = "Usage: redis_failover_server [OPTIONS]"
9
+ opts.banner = "Usage: redis_node_manager [OPTIONS]"
10
10
 
11
- opts.on('-P', '--port port', 'Server port') do |port|
12
- options[:port] = Integer(port)
13
- end
14
-
15
- opts.on('-p', '--password password', 'Redis password') do |password|
11
+ opts.on('-p', '--password password', 'Redis password (optional)') do |password|
16
12
  options[:password] = password.strip
17
13
  end
18
14
 
19
- opts.on('-n', '--nodes nodes', 'Comma-separated redis host:port pairs') do |nodes|
15
+ opts.on('-n', '--nodes redis nodes',
16
+ 'Comma-separated redis host:port pairs (required)') do |nodes|
20
17
  # turns 'host1:port,host2:port' => [{:host => host, :port => port}, ...]
21
18
  options[:nodes] = nodes.split(',').map do |node|
22
19
  Hash[[:host, :port].zip(node.strip.split(':'))]
23
20
  end
24
21
  end
25
22
 
23
+ opts.on('-z', '--zkservers zookeeper servers',
24
+ 'Comma-separated zookeeper host:port pairs (required)') do |servers|
25
+ options[:zkservers] = servers
26
+ end
27
+
28
+ opts.on('--znode-path path',
29
+ 'Znode path override for storing redis server list (optional)') do |path|
30
+ options[:znode_path] = path
31
+ end
32
+
26
33
  opts.on('--max-failures count',
27
- 'Max failures before server marks node unavailable (default 3)') do |max|
34
+ 'Max failures before manager marks node unavailable (default 3)') do |max|
28
35
  options[:max_failures] = Integer(max)
29
36
  end
30
37
 
@@ -35,7 +42,6 @@ module RedisFailover
35
42
  end
36
43
 
37
44
  parser.parse(source)
38
-
39
45
  # assume password is same for all redis nodes
40
46
  if password = options[:password]
41
47
  options[:nodes].each { |opts| opts.update(:password => password) }
@@ -1,5 +1,4 @@
1
1
  require 'set'
2
- require 'open-uri'
3
2
 
4
3
  module RedisFailover
5
4
  # Redis failover-aware client.
@@ -7,7 +6,6 @@ module RedisFailover
7
6
  include Util
8
7
 
9
8
  RETRY_WAIT_TIME = 3
10
- REDIS_ERRORS = Errno.constants.map { |c| Errno.const_get(c) }.freeze
11
9
  REDIS_READ_OPS = Set[
12
10
  :echo,
13
11
  :exists,
@@ -67,29 +65,27 @@ module RedisFailover
67
65
  #
68
66
  # Options:
69
67
  #
70
- # :host - redis failover server host (required)
71
- # :port - redis failover server port (required)
72
- # :password - optional password for redis nodes
73
- # :namespace - optional namespace for redis nodes
74
- # :logger - optional logger override
68
+ # :zkservers - comma-separated zookeeper host:port pairs (required)
69
+ # :znode_path - the Znode path override for redis server list (optional)
70
+ # :password - password for redis nodes (optional)
71
+ # :namespace - namespace for redis nodes (optional)
72
+ # :logger - logger override (optional)
75
73
  # :retry_failure - indicate if failures should be retried (default true)
76
74
  # :max_retries - max retries for a failure (default 5)
77
75
  #
78
76
  def initialize(options = {})
79
- unless options.values_at(:host, :port).all?
80
- raise ArgumentError, ':host and :port options required'
81
- end
82
-
83
77
  Util.logger = options[:logger] if options[:logger]
78
+ @zkservers = options.fetch(:zkservers) { raise ArgumentError, ':zkservers required'}
79
+ @znode = options[:znode_path] || Util::DEFAULT_ZNODE_PATH
84
80
  @namespace = options[:namespace]
85
81
  @password = options[:password]
86
82
  @retry = options.fetch(:retry_failure, true)
87
83
  @max_retries = @retry ? options.fetch(:max_retries, 3) : 0
88
- @server_url = "http://#{options[:host]}:#{options[:port]}/redis_servers"
89
84
  @master = nil
90
85
  @slaves = []
86
+ @lock = Mutex.new
87
+ setup_zookeeper_client
91
88
  build_clients
92
- start_background_monitor
93
89
  end
94
90
 
95
91
  def method_missing(method, *args, &block)
@@ -111,6 +107,30 @@ module RedisFailover
111
107
 
112
108
  private
113
109
 
110
+ def setup_zookeeper_client
111
+ @zkclient = ZkClient.new(@zkservers)
112
+
113
+ # when session expires / we are disconnected, purge client list
114
+ @zkclient.on_session_expiration do
115
+ @lock.synchronize { purge_clients }
116
+ end
117
+ @zkclient.event_handler.register_state_handler(:connecting) do
118
+ @lock.synchronize { purge_clients }
119
+ end
120
+
121
+ # register a watcher for future changes
122
+ @zkclient.watcher.register(@znode) do |event|
123
+ if event.node_created? || event.node_changed?
124
+ build_clients
125
+ elsif event.node_deleted?
126
+ @zkclient.stat( @znode, :watch => true)
127
+ @lock.synchronize { purge_clients }
128
+ else
129
+ logger.error("Unknown ZK node event: #{event.inspect}")
130
+ end
131
+ end
132
+ end
133
+
114
134
  def redis_operation?(method)
115
135
  Redis.public_instance_methods(false).include?(method)
116
136
  end
@@ -127,12 +147,11 @@ module RedisFailover
127
147
  # direct everything else to master
128
148
  master.send(method, *args, &block)
129
149
  end
130
- rescue Error, *REDIS_ERRORS
131
- logger.error("No suitable node available for operation `#{method}.`")
132
- build_clients
133
-
150
+ rescue *ALL_ERRORS => ex
151
+ logger.error("Error while handling operation `#{method}` - #{ex.message}")
134
152
  if tries < @max_retries
135
153
  tries += 1
154
+ build_clients
136
155
  sleep(RETRY_WAIT_TIME) && retry
137
156
  end
138
157
 
@@ -141,7 +160,7 @@ module RedisFailover
141
160
  end
142
161
 
143
162
  def master
144
- master = @master
163
+ master = @lock.synchronize { @master }
145
164
  if master
146
165
  verify_role!(master, :master)
147
166
  return master
@@ -151,7 +170,7 @@ module RedisFailover
151
170
 
152
171
  def slave
153
172
  # pick a slave, if none available fallback to master
154
- if slave = @slaves.sample
173
+ if slave = @lock.synchronize { @slaves.sample }
155
174
  verify_role!(slave, :slave)
156
175
  return slave
157
176
  end
@@ -159,39 +178,40 @@ module RedisFailover
159
178
  end
160
179
 
161
180
  def build_clients
162
- tries = 0
163
-
164
- begin
165
- logger.info('Checking for new redis nodes.')
166
- nodes = fetch_nodes
167
- return unless nodes_changed?(nodes)
168
-
169
- logger.info('Node change detected, rebuilding clients.')
170
- master = new_clients_for(nodes[:master]).first if nodes[:master]
171
- slaves = new_clients_for(*nodes[:slaves])
172
-
173
- # once clients are successfully created, swap the references
174
- @master = master
175
- @slaves = slaves
176
- rescue => ex
177
- logger.error("Failed to fetch nodes from #{@server_url} - #{ex.message}")
178
- logger.error(ex.backtrace.join("\n"))
181
+ @lock.synchronize do
182
+ tries = 0
183
+
184
+ begin
185
+ nodes = fetch_nodes
186
+ return unless nodes_changed?(nodes)
187
+
188
+ purge_clients
189
+ logger.info("Building new clients for nodes #{nodes}")
190
+ new_master = new_clients_for(nodes[:master]).first if nodes[:master]
191
+ new_slaves = new_clients_for(*nodes[:slaves])
192
+ @master = new_master
193
+ @slaves = new_slaves
194
+ rescue *ALL_ERRORS => ex
195
+ purge_clients
196
+ logger.error("Failed to fetch nodes from #{@zkservers} - #{ex.message}")
197
+ logger.error(ex.backtrace.join("\n"))
198
+
199
+ if tries < @max_retries
200
+ tries += 1
201
+ sleep(RETRY_WAIT_TIME) && retry
202
+ end
179
203
 
180
- if tries < @max_retries
181
- tries += 1
182
- sleep(RETRY_WAIT_TIME) && retry
204
+ raise
183
205
  end
184
-
185
- raise FailoverServerUnavailableError.new(@server_url)
186
206
  end
187
207
  end
188
208
 
189
209
  def fetch_nodes
190
- open(@server_url) do |io|
191
- nodes = symbolize_keys(MultiJson.decode(io))
192
- logger.info("Fetched nodes: #{nodes}")
193
- nodes
194
- end
210
+ data = @zkclient.get(@znode, :watch => true).first
211
+ nodes = symbolize_keys(decode(data))
212
+ logger.debug("Fetched nodes: #{nodes}")
213
+
214
+ nodes
195
215
  end
196
216
 
197
217
  def new_clients_for(*nodes)
@@ -243,20 +263,23 @@ module RedisFailover
243
263
  false
244
264
  end
245
265
 
246
- # Spawns a background thread to periodically fetch the latest
247
- # set of nodes from the redis failover server.
248
- def start_background_monitor
249
- Thread.new do
250
- loop do
251
- sleep(10)
266
+ def disconnect(*connections)
267
+ connections.each do |conn|
268
+ if conn
252
269
  begin
253
- build_clients
254
- rescue => ex
255
- logger.error("Failed to poll for new nodes from #{@server_url} - #{ex.message}")
256
- logger.error(ex.backtrace.join("\n"))
270
+ conn.client.disconnect
271
+ rescue
272
+ # best effort
257
273
  end
258
274
  end
259
275
  end
260
276
  end
277
+
278
+ def purge_clients
279
+ logger.info("Purging current redis clients")
280
+ disconnect(@master, *@slaves)
281
+ @master = nil
282
+ @slaves = []
283
+ end
261
284
  end
262
285
  end
@@ -28,12 +28,6 @@ module RedisFailover
28
28
  class NoSlaveError < Error
29
29
  end
30
30
 
31
- class FailoverServerUnavailableError < Error
32
- def initialize(failover_server_url)
33
- super("Unable to access #{failover_server_url}")
34
- end
35
- end
36
-
37
31
  class InvalidNodeRoleError < Error
38
32
  def initialize(node, assumed, actual)
39
33
  super("Invalid role detected for node #{node}, client thought " +
@@ -52,6 +52,7 @@ module RedisFailover
52
52
  perform_operation do |redis|
53
53
  unless slave_of?(master)
54
54
  redis.slaveof(master.host, master.port)
55
+ logger.info("#{self} now a slave of master #{master}")
55
56
  wakeup
56
57
  end
57
58
  end
@@ -61,6 +62,7 @@ module RedisFailover
61
62
  perform_operation do |redis|
62
63
  unless master?
63
64
  redis.slaveof('no', 'one')
65
+ logger.info("#{self} is now master")
64
66
  wakeup
65
67
  end
66
68
  end
@@ -5,13 +5,15 @@ module RedisFailover
5
5
 
6
6
  def initialize(options)
7
7
  @options = options
8
- @master, @slaves = parse_nodes
8
+ @zkclient = ZkClient.new(@options[:zkservers])
9
+ @znode = @options[:znode_path] || Util::DEFAULT_ZNODE_PATH
9
10
  @unavailable = []
10
11
  @queue = Queue.new
11
- @lock = Mutex.new
12
+ discover_nodes
12
13
  end
13
14
 
14
15
  def start
16
+ initialize_path
15
17
  spawn_watchers
16
18
  handle_state_changes
17
19
  end
@@ -20,16 +22,6 @@ module RedisFailover
20
22
  @queue << [node, state]
21
23
  end
22
24
 
23
- def nodes
24
- @lock.synchronize do
25
- {
26
- :master => @master ? @master.to_s : nil,
27
- :slaves => @slaves.map(&:to_s),
28
- :unavailable => @unavailable.map(&:to_s)
29
- }
30
- end
31
- end
32
-
33
25
  def shutdown
34
26
  @watchers.each(&:shutdown)
35
27
  end
@@ -38,20 +30,20 @@ module RedisFailover
38
30
 
39
31
  def handle_state_changes
40
32
  while state_change = @queue.pop
41
- @lock.synchronize do
33
+ begin
42
34
  node, state = state_change
43
- begin
44
- case state
45
- when :unavailable then handle_unavailable(node)
46
- when :available then handle_available(node)
47
- when :syncing then handle_syncing(node)
48
- else raise InvalidNodeStateError.new(node, state)
49
- end
50
- rescue NodeUnavailableError
51
- # node suddenly became unavailable, silently
52
- # handle since the watcher will take care of
53
- # keeping track of the node
35
+ case state
36
+ when :unavailable then handle_unavailable(node)
37
+ when :available then handle_available(node)
38
+ when :syncing then handle_syncing(node)
39
+ else raise InvalidNodeStateError.new(node, state)
54
40
  end
41
+
42
+ # flush current state
43
+ write_state
44
+ rescue *ALL_ERRORS => ex
45
+ logger.error("Error while handling #{state_change.inspect}: #{ex.message}")
46
+ logger.error(ex.backtrace.join("\n"))
55
47
  end
56
48
  end
57
49
  end
@@ -104,6 +96,7 @@ module RedisFailover
104
96
  end
105
97
 
106
98
  def promote_new_master(node = nil)
99
+ delete_path
107
100
  @master = nil
108
101
 
109
102
  # make a specific node or slave the new master
@@ -116,18 +109,21 @@ module RedisFailover
116
109
  candidate.make_master!
117
110
  @master = candidate
118
111
  redirect_slaves_to_master
112
+ create_path
113
+ write_state
119
114
  logger.info("Successfully promoted #{candidate} to master.")
120
115
  end
121
116
 
122
- def parse_nodes
117
+ def discover_nodes
123
118
  nodes = @options[:nodes].map { |opts| Node.new(opts) }.uniq
124
- raise NoMasterError unless master = find_master(nodes)
125
- slaves = nodes - [master]
119
+ raise NoMasterError unless @master = find_master(nodes)
120
+ @slaves = nodes - [@master]
126
121
 
127
- logger.info("Managing master (#{master}) and slaves" +
128
- " (#{slaves.map(&:to_s).join(', ')})")
122
+ # ensure that slaves are correctly pointing to this master
123
+ redirect_slaves_to_master
129
124
 
130
- [master, slaves]
125
+ logger.info("Managing master (#{@master}) and slaves" +
126
+ " (#{@slaves.map(&:to_s).join(', ')})")
131
127
  end
132
128
 
133
129
  def spawn_watchers
@@ -169,6 +165,7 @@ module RedisFailover
169
165
  return if @master == node && node.master?
170
166
  return if @master && node.slave_of?(@master)
171
167
 
168
+ logger.info("Reconciling node #{node}")
172
169
  if @master == node && !node.master?
173
170
  # we think the node is a master, but the node doesn't
174
171
  node.make_master!
@@ -180,5 +177,37 @@ module RedisFailover
180
177
  node.make_slave!(@master)
181
178
  end
182
179
  end
180
+
181
+ def current_nodes
182
+ {
183
+ :master => @master ? @master.to_s : nil,
184
+ :slaves => @slaves.map(&:to_s),
185
+ :unavailable => @unavailable.map(&:to_s)
186
+ }
187
+ end
188
+
189
+ def delete_path
190
+ if @zkclient.stat(@znode).exists?
191
+ @zkclient.delete(@znode)
192
+ logger.info("Deleted zookeeper node #{@znode}")
193
+ end
194
+ end
195
+
196
+ def create_path
197
+ # create nodes path if it doesn't already exist in ZK
198
+ unless @zkclient.stat(@znode).exists?
199
+ @zkclient.create(@znode, encode(current_nodes))
200
+ logger.info("Created zookeeper node #{@znode}")
201
+ end
202
+ end
203
+
204
+ def initialize_path
205
+ create_path
206
+ write_state
207
+ end
208
+
209
+ def write_state
210
+ @zkclient.set(@znode, encode(current_nodes))
211
+ end
183
212
  end
184
213
  end
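The queue-driven loop in the Node Manager can be illustrated in isolation. The sketch below is not the gem's class: `TinyManager`, `notify_state` and `stop` are hypothetical names, nodes are plain strings, and the ZooKeeper write that follows each change in the real code is left out. It shows one consumer draining state changes pushed by other threads.

```ruby
require 'thread'

# Hypothetical stand-in for the manager: keeps node names in plain arrays.
class TinyManager
  attr_reader :available, :unavailable

  def initialize
    @queue = Queue.new
    @available = []
    @unavailable = []
  end

  # Watchers would call this from their own threads.
  def notify_state(node, state)
    @queue << [node, state]
  end

  # Pushing nil ends the loop in handle_state_changes.
  def stop
    @queue << nil
  end

  # Drains queued changes one at a time on the calling thread.
  def handle_state_changes
    while (change = @queue.pop)
      node, state = change
      case state
      when :available
        @unavailable.delete(node)
        @available << node unless @available.include?(node)
      when :unavailable
        @available.delete(node)
        @unavailable << node unless @unavailable.include?(node)
      else
        raise ArgumentError, "invalid state #{state.inspect} for #{node}"
      end
    end
  end
end

manager = TinyManager.new
manager.notify_state('localhost:6379', :available)
manager.notify_state('localhost:6379', :unavailable)
manager.stop
manager.handle_state_changes
```

Since only the consuming thread touches the arrays in this sketch, no mutex is needed around them.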
@@ -1,14 +1,13 @@
1
1
  module RedisFailover
2
- # Runner is responsible for bootstrapping the redis failover server.
2
+ # Runner is responsible for bootstrapping the redis Node Manager.
3
3
  class Runner
4
4
  def self.run(options)
5
5
  options = CLI.parse(options)
6
- Util.logger.info("Redis Failover Server starting on port #{options[:port]}")
7
- Server.set(:port, options[:port])
8
6
  @node_manager = NodeManager.new(options)
9
- server_thread = Thread.new { Server.run! { |server| trap_signals } }
7
+ trap_signals
10
8
  node_manager_thread = Thread.new { @node_manager.start }
11
- [server_thread, node_manager_thread].each(&:join)
9
+ Util.logger.info("Redis Node Manager successfully started.")
10
+ node_manager_thread.join
12
11
  end
13
12
 
14
13
  def self.node_manager
@@ -1,8 +1,18 @@
1
+ require 'redis_failover/errors'
2
+
1
3
  module RedisFailover
2
4
  # Common utility methods.
3
5
  module Util
4
6
  extend self
5
7
 
8
+ DEFAULT_ZNODE_PATH = '/redis_failover_nodes'
9
+ REDIS_ERRORS = Errno.constants.map { |c| Errno.const_get(c) }
10
+ ALL_ERRORS = [
11
+ RedisFailover::Error,
12
+ ZookeeperExceptions::ZookeeperException,
13
+ REDIS_ERRORS,
14
+ StandardError].flatten
15
+
6
16
  def symbolize_keys(hash)
7
17
  Hash[hash.map { |k, v| [k.to_sym, v] }]
8
18
  end
@@ -29,5 +39,14 @@ module RedisFailover
29
39
  def logger
30
40
  Util.logger
31
41
  end
42
+
43
+ def encode(data)
44
+ MultiJson.encode(data)
45
+ end
46
+
47
+ def decode(data)
48
+ return unless data
49
+ MultiJson.decode(data)
50
+ end
32
51
  end
33
52
  end
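The state written to the znode is a JSON-encoded hash, which clients decode and then symbolize. A round trip can be sketched as below. Note the assumption: the gem goes through MultiJson, while this sketch uses the standard library's `json` so that it runs without extra gems. The node addresses are made-up sample values.

```ruby
require 'json'

# Same shape as the helper in Util: converts top-level keys to symbols.
def symbolize_keys(hash)
  Hash[hash.map { |k, v| [k.to_sym, v] }]
end

state = {
  :master      => 'localhost:6379',
  :slaves      => ['localhost:6380'],
  :unavailable => []
}

payload = JSON.generate(state)                # string a manager would store
nodes   = symbolize_keys(JSON.parse(payload)) # hash a client would rebuild
```

Only the top-level keys are symbolized, which is enough here because the values are strings and arrays of strings.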
@@ -1,3 +1,3 @@
1
1
  module RedisFailover
2
- VERSION = "0.4.0"
2
+ VERSION = "0.5.0"
3
3
  end
@@ -0,0 +1,78 @@
1
+ module RedisFailover
2
+ # ZkClient is a thin wrapper over the ZK client to gracefully handle reconnects
3
+ # when a session expires.
4
+ class ZkClient
5
+ include Util
6
+
7
+ MAX_RECONNECTS = 3
8
+
9
+ def initialize(servers)
10
+ @servers = servers
11
+ @lock = Mutex.new
12
+ build_client
13
+ end
14
+
15
+ def on_session_expiration(&block)
16
+ @client.on_expired_session { block.call }
17
+ @on_session_expiration = block
18
+ end
19
+
20
+ def get(*args, &block)
21
+ perform_with_reconnect { @client.get(*args, &block) }
22
+ end
23
+
24
+ def set(*args, &block)
25
+ perform_with_reconnect { @client.set(*args, &block) }
26
+ end
27
+
28
+ def watcher(*args, &block)
29
+ perform_with_reconnect { @client.watcher(*args, &block) }
30
+ end
31
+
32
+ def event_handler(*args, &block)
33
+ perform_with_reconnect { @client.event_handler(*args, &block) }
34
+ end
35
+
36
+ def stat(*args, &block)
37
+ perform_with_reconnect { @client.stat(*args, &block) }
38
+ end
39
+
40
+ def create(*args, &block)
41
+ perform_with_reconnect { @client.create(*args, &block) }
42
+ end
43
+
44
+ def delete(*args, &block)
45
+ perform_with_reconnect { @client.delete(*args, &block) }
46
+ end
47
+
48
+ private
49
+
50
+ def perform_with_reconnect
51
+ tries = 0
52
+ begin
53
+ yield
54
+ rescue ZookeeperExceptions::ZookeeperException::SessionExpired
55
+ logger.info("Zookeeper client session expired, rebuilding client.")
56
+ if tries < MAX_RECONNECTS
57
+ tries += 1
58
+ build_client
59
+ @on_session_expiration.call if @on_session_expiration
60
+ sleep(2) && retry
61
+ end
62
+
63
+ raise
64
+ end
65
+ end
66
+
67
+ def build_client
68
+ @lock.synchronize do
69
+ if @client
70
+ @client.reopen
71
+ else
72
+ @client = ZK.new(@servers)
73
+ end
74
+ logger.info("Communicating with zookeeper servers #{@servers}")
75
+ end
76
+ end
77
+ end
78
+ end
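The reconnect-and-retry pattern used by `ZkClient` can be tried out without a ZooKeeper server. In the sketch below, `FakeSessionExpired` is a hypothetical stand-in for the real session-expired exception, the `reconnect` callable replaces the client rebuild, and the pause is a parameter so the example does not have to sleep.

```ruby
# Hypothetical error class standing in for ZooKeeper's session-expired exception.
class FakeSessionExpired < StandardError; end

MAX_RECONNECTS = 3

# Runs the block; on session expiry, calls `reconnect` and retries,
# up to MAX_RECONNECTS times, then re-raises the last error.
def perform_with_reconnect(reconnect, pause = 0)
  tries = 0
  begin
    yield
  rescue FakeSessionExpired
    if tries < MAX_RECONNECTS
      tries += 1
      reconnect.call
      sleep(pause)
      retry
    end
    raise
  end
end

attempts = 0
reconnects = 0
result = perform_with_reconnect(lambda { reconnects += 1 }) do
  attempts += 1
  raise FakeSessionExpired if attempts < 3 # fail twice, then succeed
  :ok
end
```

In this run the block succeeds on the third attempt after two reconnects. A block that always fails is attempted four times in total before the error propagates.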
@@ -4,8 +4,8 @@ require File.expand_path('../lib/redis_failover/version', __FILE__)
 Gem::Specification.new do |gem|
   gem.authors = ["Ryan LeCompte"]
   gem.email = ["lecompte@gmail.com"]
-  gem.description = %(Redis Failover provides a full automatic master/slave failover solution for Ruby)
-  gem.summary = %(Redis Failover provides a full automatic master/slave failover solution for Ruby)
+  gem.description = %(Redis Failover is a ZooKeeper-based automatic master/slave failover solution for Ruby)
+  gem.summary = %(Redis Failover is a ZooKeeper-based automatic master/slave failover solution for Ruby)
   gem.homepage = "http://github.com/ryanlecompte/redis_failover"
 
   gem.executables = `git ls-files -- bin/*`.split("\n").map{ |f| File.basename(f) }
@@ -18,7 +18,7 @@ Gem::Specification.new do |gem|
   gem.add_dependency('redis')
   gem.add_dependency('redis-namespace')
   gem.add_dependency('multi_json')
-  gem.add_dependency('sinatra')
+  gem.add_dependency('zk')
 
   gem.add_development_dependency('rake')
   gem.add_development_dependency('rspec')
data/spec/cli_spec.rb CHANGED
@@ -7,11 +7,6 @@ module RedisFailover
         CLI.parse({}).should == {}
       end
 
-      it 'properly parses a server port' do
-        opts = CLI.parse(['-P 2222'])
-        opts.should == {:port => 2222}
-      end
-
       it 'properly parses redis nodes' do
         opts = CLI.parse(['-n host1:1,host2:2,host3:3'])
         opts[:nodes].should == [
data/spec/client_spec.rb CHANGED
@@ -18,10 +18,12 @@ module RedisFailover
         :unavailable => []
       }
     end
+
+    def setup_zookeeper_client; end
   end
 
   describe Client do
-    let(:client) { ClientStub.new(:host => 'localhost', :port => 3000) }
+    let(:client) { ClientStub.new(:zkservers => 'localhost:9281') }
 
     describe '#build_clients' do
       it 'properly parses master' do
@@ -31,18 +33,6 @@ module RedisFailover
       it 'properly parses slaves' do
         client.current_slaves.first.to_s.should == 'localhost:1111'
       end
-
-      it 'does not rebuild clients if hosts have not changed' do
-        class << client
-          attr_reader :built_new_client
-          def new_clients_for(*)
-            @built_new_client = true
-          end
-        end
-
-        5.times { client.send(:build_clients) }
-        client.built_new_client.should_not be_true
-      end
     end
 
     describe '#dispatch' do
@@ -57,7 +47,7 @@ module RedisFailover
         client.get('foo')
       end
 
-      it 'reconnects with redis failover server when node is unavailable' do
+      it 'reconnects when node is unavailable' do
         class << client
           attr_reader :reconnected
           def build_clients
@@ -80,12 +70,6 @@ module RedisFailover
         client.reconnected.should be_true
       end
 
-      it 'fails hard when the failover server is unavailable' do
-        expect do
-          Client.new(:host => 'foo', :port => 123445)
-        end.to raise_error(FailoverServerUnavailableError)
-      end
-
       it 'properly detects when a node has changed roles' do
         client.current_master.change_role_to('slave')
         expect { client.send(:master) }.to raise_error(InvalidNodeRoleError)
@@ -6,7 +6,7 @@ module RedisFailover
 
     describe '#nodes' do
       it 'returns current master and slave nodes' do
-        manager.nodes.should == {
+        manager.current_nodes.should == {
           :master => 'master:6379',
           :slaves => ['slave:6379'],
           :unavailable => []
@@ -19,7 +19,7 @@ module RedisFailover
         it 'moves slave to unavailable list' do
           slave = manager.slaves.first
           manager.force_unavailable(slave)
-          manager.nodes[:unavailable].should include(slave.to_s)
+          manager.current_nodes[:unavailable].should include(slave.to_s)
         end
       end
 
@@ -35,7 +35,7 @@ module RedisFailover
         end
 
         it 'moves master to unavailable list' do
-          manager.nodes[:unavailable].should include(@master.to_s)
+          manager.current_nodes[:unavailable].should include(@master.to_s)
         end
       end
     end
@@ -50,8 +50,8 @@ module RedisFailover
       context 'slave node with a master present' do
         it 'removes slave from unavailable list' do
           manager.force_available(@slave)
-          manager.nodes[:unavailable].should be_empty
-          manager.nodes[:slaves].should include(@slave.to_s)
+          manager.current_nodes[:unavailable].should be_empty
+          manager.current_nodes[:slaves].should include(@slave.to_s)
         end
 
         it 'makes node a slave of new master' do
@@ -76,13 +76,12 @@ module RedisFailover
         end
 
         it 'promotes slave to master' do
-          manager.master.should be_nil
           manager.force_available(@slave)
           manager.master.should == @slave
         end
 
         it 'slaves list remains empty' do
-          manager.nodes[:slaves].should be_empty
+          manager.current_nodes[:slaves].should be_empty
         end
       end
     end
@@ -92,7 +91,7 @@ module RedisFailover
         it 'adds node to unavailable list' do
           slave = manager.slaves.first
           manager.force_syncing(slave, false)
-          manager.nodes[:unavailable].should include(slave.to_s)
+          manager.current_nodes[:unavailable].should include(slave.to_s)
         end
       end
 
@@ -100,8 +99,8 @@ module RedisFailover
         it 'makes node available' do
           slave = manager.slaves.first
           manager.force_syncing(slave, true)
-          manager.nodes[:unavailable].should_not include(slave.to_s)
-          manager.nodes[:slaves].should include(slave.to_s)
+          manager.current_nodes[:unavailable].should_not include(slave.to_s)
+          manager.current_nodes[:slaves].should include(slave.to_s)
         end
       end
     end
data/spec/spec_helper.rb CHANGED
@@ -9,7 +9,10 @@ class NullObject
   end
 end
 
-RedisFailover::Util.logger = NullObject.new
+module RedisFailover
+  Util.logger = NullObject.new
+  def ZkClient.new(*args); NullObject.new; end
+end
 
 RSpec.configure do |config|
 end
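Review note: the spec helper above replaces `ZkClient.new` with a method that returns a `NullObject`, so the specs never open a real ZooKeeper connection. The sketch below shows one way such a null object and constructor override can work. The body of `NullObject` is an assumption (only its name appears in this diff), and `ZkClientSketch` is a hypothetical class used in place of the real `ZkClient`.

```ruby
# Assumed shape of a null object: absorbs any message and returns itself,
# so chained calls never fail.
class NullObject
  def method_missing(_name, *_args)
    yield if block_given?
    self
  end

  def respond_to_missing?(_name, _include_private = false)
    true
  end
end

# Hypothetical client whose real constructor would need a live server.
class ZkClientSketch
  def initialize(servers)
    raise "would connect to #{servers}"
  end
end

# Override the class-level constructor, as the spec helper does for ZkClient.
def ZkClientSketch.new(*args); NullObject.new; end

zk = ZkClientSketch.new('localhost:2181')
chained = zk.create('/some/path').stat.anything_else
```

Defining `new` on the class itself bypasses `initialize` entirely, which is why no connection attempt is made.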
@@ -1,14 +1,16 @@
 module RedisFailover
   class NodeManagerStub < NodeManager
     attr_accessor :master
+    public :current_nodes
 
-    def parse_nodes
+    def discover_nodes
       master = Node.new(:host => 'master')
       slave = Node.new(:host => 'slave')
       [master, slave].each { |node| node.extend(RedisStubSupport) }
       master.make_master!
       slave.make_slave!(master)
-      [master, [slave]]
+      @master = master
+      @slaves = [slave]
     end
 
     def slaves
@@ -44,5 +46,10 @@ module RedisFailover
       notify_state_change(node, :syncing)
       stop_processing
     end
+
+    def initialize_path; end
+    def delete_path; end
+    def create_path; end
+    def write_state; end
   end
 end
metadata CHANGED
@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: redis_failover
 version: !ruby/object:Gem::Version
-  version: 0.4.0
+  version: 0.5.0
   prerelease:
 platform: ruby
 authors:
@@ -9,11 +9,11 @@ authors:
 autorequire:
 bindir: bin
 cert_chain: []
-date: 2012-04-15 00:00:00.000000000 Z
+date: 2012-04-17 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: redis
-  requirement: &70210559246440 !ruby/object:Gem::Requirement
+  requirement: &70300698370600 !ruby/object:Gem::Requirement
     none: false
     requirements:
     - - ! '>='
@@ -21,10 +21,10 @@ dependencies:
         version: '0'
   type: :runtime
   prerelease: false
-  version_requirements: *70210559246440
+  version_requirements: *70300698370600
 - !ruby/object:Gem::Dependency
   name: redis-namespace
-  requirement: &70210559246000 !ruby/object:Gem::Requirement
+  requirement: &70300698370180 !ruby/object:Gem::Requirement
     none: false
     requirements:
     - - ! '>='
@@ -32,10 +32,10 @@ dependencies:
         version: '0'
   type: :runtime
   prerelease: false
-  version_requirements: *70210559246000
+  version_requirements: *70300698370180
 - !ruby/object:Gem::Dependency
   name: multi_json
-  requirement: &70210559245580 !ruby/object:Gem::Requirement
+  requirement: &70300698369740 !ruby/object:Gem::Requirement
     none: false
     requirements:
     - - ! '>='
@@ -43,10 +43,10 @@ dependencies:
         version: '0'
   type: :runtime
   prerelease: false
-  version_requirements: *70210559245580
+  version_requirements: *70300698369740
 - !ruby/object:Gem::Dependency
-  name: sinatra
-  requirement: &70210559245160 !ruby/object:Gem::Requirement
+  name: zk
+  requirement: &70300698369320 !ruby/object:Gem::Requirement
     none: false
     requirements:
     - - ! '>='
@@ -54,10 +54,10 @@ dependencies:
         version: '0'
   type: :runtime
   prerelease: false
-  version_requirements: *70210559245160
+  version_requirements: *70300698369320
 - !ruby/object:Gem::Dependency
   name: rake
-  requirement: &70210559244740 !ruby/object:Gem::Requirement
+  requirement: &70300698368900 !ruby/object:Gem::Requirement
     none: false
     requirements:
     - - ! '>='
@@ -65,10 +65,10 @@ dependencies:
         version: '0'
   type: :development
   prerelease: false
-  version_requirements: *70210559244740
+  version_requirements: *70300698368900
 - !ruby/object:Gem::Dependency
   name: rspec
-  requirement: &70210559244320 !ruby/object:Gem::Requirement
+  requirement: &70300698368480 !ruby/object:Gem::Requirement
     none: false
     requirements:
     - - ! '>='
@@ -76,13 +76,13 @@ dependencies:
         version: '0'
   type: :development
   prerelease: false
-  version_requirements: *70210559244320
-description: Redis Failover provides a full automatic master/slave failover solution
+  version_requirements: *70300698368480
+description: Redis Failover is a ZooKeeper-based automatic master/slave failover solution
   for Ruby
 email:
 - lecompte@gmail.com
 executables:
-- redis_failover_server
+- redis_node_manager
 extensions: []
 extra_rdoc_files: []
 files:
  files:
@@ -93,7 +93,7 @@ files:
93
93
  - LICENSE
94
94
  - README.md
95
95
  - Rakefile
96
- - bin/redis_failover_server
96
+ - bin/redis_node_manager
97
97
  - lib/redis_failover.rb
98
98
  - lib/redis_failover/cli.rb
99
99
  - lib/redis_failover/client.rb
@@ -102,9 +102,9 @@ files:
 - lib/redis_failover/node_manager.rb
 - lib/redis_failover/node_watcher.rb
 - lib/redis_failover/runner.rb
-- lib/redis_failover/server.rb
 - lib/redis_failover/util.rb
 - lib/redis_failover/version.rb
+- lib/redis_failover/zk_client.rb
 - redis_failover.gemspec
 - spec/cli_spec.rb
 - spec/client_spec.rb
@@ -129,7 +129,7 @@ required_ruby_version: !ruby/object:Gem::Requirement
       version: '0'
       segments:
       - 0
-      hash: -4081812195470581448
+      hash: -1870245960706248512
 required_rubygems_version: !ruby/object:Gem::Requirement
   none: false
   requirements:
@@ -138,14 +138,14 @@ required_rubygems_version: !ruby/object:Gem::Requirement
       version: '0'
       segments:
       - 0
-      hash: -4081812195470581448
+      hash: -1870245960706248512
 requirements: []
 rubyforge_project:
 rubygems_version: 1.8.16
 signing_key:
 specification_version: 3
-summary: Redis Failover provides a full automatic master/slave failover solution for
-  Ruby
+summary: Redis Failover is a ZooKeeper-based automatic master/slave failover solution
+  for Ruby
 test_files:
 - spec/cli_spec.rb
 - spec/client_spec.rb
@@ -1,13 +0,0 @@
-require 'sinatra'
-
-module RedisFailover
-  # Serves as an endpoint for discovering the current redis master and slaves.
-  class Server < Sinatra::Base
-    disable :logging
-
-    get '/redis_servers' do
-      content_type :json
-      MultiJson.encode(Runner.node_manager.nodes)
-    end
-  end
-end