mongo_ha 1.11.0.rc1

checksums.yaml ADDED
@@ -0,0 +1,7 @@
+ ---
+ SHA1:
+   metadata.gz: 40d6a7af7f740daf8f07e5f79713ae0b2ad76e2d
+   data.tar.gz: 1a0ba4bdda6f79ea283d1544f926d1901b500a1f
+ SHA512:
+   metadata.gz: 58f85d47132a40bf22cad95bc38044590b3fbcb0dbf1a78171f3dc3ae29aa559a7ff836867f3926465ecb2fdc5809de3447d50318112fc7665e39313c21ce355
+   data.tar.gz: a9e5aae3321e9e17f5ed1f401a952c8e39a60c1bea47345c4c5a461135a95f10e56411bb087fbc07a115882facaca62b010b6d17640cd35140d646116645de87
data/README.md ADDED
@@ -0,0 +1,162 @@
+ # mongo_ha
+
+ High availability for the mongo ruby driver. Automatic reconnects and recovery when the replica-set changes or connections are lost.
+
+ ## Status
+
+ Production Ready: Used every day in an enterprise environment across
+ remote data centers.
+
+ ## Overview
+
+ Adds methods to the Mongo Ruby driver to support retries on connection failure.
+
+ In the event of a connection failure, only one thread will attempt to re-establish
+ connectivity to the Mongo server(s). This is to prevent swamping the mongo
+ servers with reconnect attempts.
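
The single-thread reconnect behaviour described above can be illustrated with a small, driver-free sketch. `FakeClient`, its `reconnect` method and the `connect_calls` counter are hypothetical stand-ins for illustration only; they are not part of mongo_ha or the Mongo driver, and the real gem may differ in detail.

```ruby
# Hypothetical stand-in for a client whose connection has been lost.
class FakeClient
  attr_reader :connect_calls

  def initialize
    @mutex         = Mutex.new
    @connected     = false
    @connect_calls = 0
  end

  # Only one thread at a time may run the reconnect logic.
  def reconnect
    @mutex.synchronize do
      # Another thread may already have restored the connection
      return true if @connected

      @connect_calls += 1
      sleep 0.01 # simulate the cost of reconnecting
      @connected = true
    end
  end
end

client  = FakeClient.new
threads = Array.new(10) { Thread.new { client.reconnect } }
threads.each(&:join)

puts client.connect_calls # prints 1
```

Ten threads ask for a reconnect, but only the first one to obtain the mutex does the work; the others find the connection already restored and return immediately.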
+
+ Retries are initially performed quickly in case it is a brief network issue,
+ and then back off to give the replica-set time to elect a new master.
+
+ Currently only supports the Ruby Mongo driver v1.11.x.
+
+ mongo_ha transparently supports MongoMapper since it uses the mongo ruby driver
+ that is patched by loading this gem.
+
+ Mongo Router processes will often return a connection failure on their side
+ as an OperationFailure. This code will also retry automatically when the router
+ has errors talking to a sharded cluster.
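
As a rough illustration of how such router errors can be recognised, the sketch below matches an error message against a few of the substrings listed in `MONGOS_CONNECTION_ERRORS` in this gem. The constant `RETRYABLE_ROUTER_ERRORS` and the helper `retryable_router_error?` are hypothetical names used only for this example.

```ruby
# A subset of the substrings the gem treats as retryable router errors.
RETRYABLE_ROUTER_ERRORS = [
  'socket exception',
  'Connection reset by peer',
  'transport error',
  'no master found',
  'not master'
].freeze

# Hypothetical helper: true when the message looks like a router-side
# connection failure that is worth retrying.
def retryable_router_error?(message)
  RETRYABLE_ROUTER_ERRORS.any? { |fragment| message.include?(fragment) } ||
    message.strip == ':'
end

puts retryable_router_error?('9001: socket exception')            # prints true
puts retryable_router_error?('11000: E11000 duplicate key error') # prints false
```

Anything that does not match is re-raised to the caller, so ordinary failures such as duplicate keys are never retried.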
+
+ ## Mongo Cursors
+
+ Any operations that return a cursor need to be handled in your own code
+ since the retry cannot be handled transparently.
+ For example: `find` returns a cursor, whereas `find_one` is handled because
+ it returns the data rather than a cursor.
+
+ Example:
+
+ ```ruby
+ # Wrap existing cursor based calls with a retry on connection failure block
+ results_collection.retry_on_connection_failure do
+   results_collection.find({}, sort: '_id', timeout: false) do |cursor|
+     cursor.each do |record|
+       puts "Record: #{record.inspect}"
+     end
+   end
+ end
+ ```
+
+ ### Note
+
+ In the above example the block will be repeated from the _beginning_ of the
+ collection should a connection failure occur. Without appropriate handling it
+ is possible to read the same records twice.
+
+ If the collection cannot be processed twice, it may be better to just let the
+ `Mongo::ConnectionFailure` flow up into the application for it to deal with at
+ a higher level.
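
One possible form of "appropriate handling" is to remember the last `_id` that was processed and resume after it when the block is repeated. The sketch below simulates this without the driver: `SimulatedConnectionFailure`, `with_retry` and the in-memory `RECORDS` array are all hypothetical stand-ins, and with the real driver the selection step would be a query on `_id` sorted by `_id`.

```ruby
# Stand-in for Mongo::ConnectionFailure so the sketch runs without the driver.
class SimulatedConnectionFailure < StandardError; end

# Hypothetical helper that repeats the block once after a failure.
def with_retry
  retried = false
  begin
    yield
  rescue SimulatedConnectionFailure
    raise if retried
    retried = true
    retry
  end
end

RECORDS     = (1..5).map { |id| { '_id' => id } }
processed   = []
last_id     = nil
failed_once = false

with_retry do
  # Resume after the last processed _id instead of starting over
  remaining = last_id ? RECORDS.select { |r| r['_id'] > last_id } : RECORDS
  remaining.each do |record|
    if record['_id'] == 3 && !failed_once
      failed_once = true
      raise SimulatedConnectionFailure, 'connection lost'
    end
    processed << record['_id']
    last_id = record['_id']
  end
end

p processed # prints [1, 2, 3, 4, 5]
```

Each record is handled exactly once even though the block ran twice. This only works when the records are read in a stable order, such as sorted by `_id`.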
+
+ ## Installation
+
+ Add to Gemfile:
+
+ ```ruby
+ gem 'mongo_ha'
+ ```
+
+ Or for standalone environments:
+
+ ```shell
+ gem install mongo_ha
+ ```
+
+ If you are also using SemanticLogger, place `mongo_ha` below `semantic_logger`
+ and/or `rails_semantic_logger` in the `Gemfile`. This way it will create a logger
+ just for `Mongo::MongoClient` to improve the log output during connection recovery.
+
+ ## Configuration
+
+ mongo_ha adds several new configuration options to fine-tune the reconnect behavior
+ for any environment.
+
+ Sample mongo.yml:
+
+ ```yaml
+ default_options: &default_options
+   :w: 1
+   :pool_size: 5
+   :pool_timeout: 5
+   :connect_timeout: 5
+   :reconnect_attempts: 53
+   :reconnect_retry_seconds: 0.1
+   :reconnect_retry_multiplier: 2
+   :reconnect_max_retry_seconds: 5
+
+ development: &development
+   uri: mongodb://localhost:27017/development
+   options:
+     <<: *default_options
+
+ test:
+   uri: mongodb://localhost:27017/test
+   options:
+     <<: *default_options
+
+ # Sample Production Settings
+ production:
+   uri: mongodb://mongo1.site.com:27017,mongo2.site.com:27017/production
+   options:
+     <<: *default_options
+     :pool_size: 50
+     :pool_timeout: 5
+ ```
+
+ The following options can be specified in the Mongo configuration options
+ to tune the retry intervals during a connection failure:
+
+ ### :reconnect_attempts
+
+ * Number of times to attempt to reconnect.
+ * Default: 53
+
+ ### :reconnect_retry_seconds
+
+ * Initial delay, in seconds, before retrying.
+ * Default: 0.1
+
+ ### :reconnect_retry_multiplier
+
+ * Multiply the delay by this number with each retry to prevent overwhelming the server.
+ * Default: 2
+
+ ### :reconnect_max_retry_seconds
+
+ * Maximum number of seconds to wait before retrying again.
+ * Default: 5
+
+ Using the above default values results in reconnect attempts at the following intervals (in seconds):
+
+     0.1 0.2 0.4 0.8 1.6 3.2 5 5 5 5 ....
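
The schedule above follows directly from the three delay options. As a minimal sketch, the hypothetical helper `reconnect_delays` below (not part of mongo_ha) applies the same multiply-then-cap rule, using the documented defaults as its default arguments.

```ruby
# Hypothetical helper: the delay (in seconds) slept before each reconnect attempt.
def reconnect_delays(attempts, initial = 0.1, multiplier = 2.0, max = 5.0)
  delay = initial
  Array.new(attempts) do
    current = delay
    # Grow the delay for the next attempt, but never beyond the maximum
    delay = [delay * multiplier, max].min
    current
  end
end

p reconnect_delays(8) # prints [0.1, 0.2, 0.4, 0.8, 1.6, 3.2, 5.0, 5.0]
```

Summing `reconnect_delays(53)` gives roughly 241 seconds, so under these defaults the reconnect logic would keep trying for about four minutes before giving up.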
+
+ ## Testing
+
+ There is really only one place to test something like `mongo_ha`, and that is in
+ a high-volume, mission-critical production environment.
+ The initial code in this gem was created over 2 years with MongoDB running in an
+ enterprise production environment with hundreds of connections to Mongo servers
+ in remote data centers across a WAN. It adds high availability to standalone
+ MongoDB servers, replica-sets, and sharded clusters.
+
+ ## Issues
+
+ If the following output appears after adding the above connection options:
+
+ ```shell
+ reconnect_attempts is not a valid option for Mongo::MongoClient
+ reconnect_retry_seconds is not a valid option for Mongo::MongoClient
+ reconnect_retry_multiplier is not a valid option for Mongo::MongoClient
+ reconnect_max_retry_seconds is not a valid option for Mongo::MongoClient
+ ```
+
+ then the `mongo_ha` gem was not loaded prior to connecting to Mongo.
data/Rakefile ADDED
@@ -0,0 +1,28 @@
+ require 'rake/clean'
+ require 'rake/testtask'
+
+ $LOAD_PATH.unshift File.expand_path("../lib", __FILE__)
+ require 'mongo_ha/version'
+
+ task :gem do
+   system "gem build mongo_ha.gemspec"
+ end
+
+ task :publish => :gem do
+   system "git tag -a v#{MongoHA::VERSION} -m 'Tagging #{MongoHA::VERSION}'"
+   system "git push --tags"
+   system "gem push mongo_ha-#{MongoHA::VERSION}.gem"
+   system "rm mongo_ha-#{MongoHA::VERSION}.gem"
+ end
+
+ desc "Run Test Suite"
+ task :test do
+   Rake::TestTask.new(:functional) do |t|
+     t.test_files = FileList['test/*_test.rb']
+     t.verbose = true
+   end
+
+   Rake::Task['functional'].invoke
+ end
+
+ task :default => :test
data/lib/mongo_ha/mongo_client.rb ADDED
@@ -0,0 +1,188 @@
+ require 'mongo'
+ module MongoHA
+   module MongoClient
+     CONNECTION_RETRY_OPTS = [:reconnect_attempts, :reconnect_retry_seconds, :reconnect_retry_multiplier, :reconnect_max_retry_seconds]
+
+     # The following errors occur when mongos cannot connect to the shard
+     # They require a retry to resolve them
+     # This list was created through painful experience. Add any new ones as they are discovered
+     #   9001: socket exception
+     #   Operation failed with the following exception: Unknown error - Connection reset by peer:Unknown error - Connection reset by peer
+     #   DBClientBase::findOne: transport error
+     #   : db assertion failure
+     #   8002: 8002 all servers down!
+     #   Operation failed with the following exception: stream closed
+     #   Operation failed with the following exception: Bad file descriptor - Bad file descriptor:Bad file descriptor - Bad file descriptor
+     #   Failed to connect to primary node.
+     #   10009: ReplicaSetMonitor no master found for set: mdbb
+     MONGOS_CONNECTION_ERRORS = [
+       'socket exception',
+       'Connection reset by peer',
+       'transport error',
+       'db assertion failure',
+       '8002',
+       'stream closed',
+       'Bad file descriptor',
+       'Failed to connect',
+       '10009',
+       'no master found',
+       'not master',
+       'Timed out waiting on socket',
+       "didn't get writeback",
+     ]
+
+     module InstanceMethods
+       # Add retry logic to MongoClient
+       def self.included(base)
+         base.class_eval do
+           alias_method :receive_message_original, :receive_message
+           alias_method :connect_original, :connect
+           alias_method :valid_opts_original, :valid_opts
+           alias_method :setup_original, :setup
+
+           attr_accessor(*CONNECTION_RETRY_OPTS)
+
+           # Prevent multiple threads from trying to reconnect at the same time during
+           # connection failures
+           @@failover_mutex = Mutex.new
+           # Wrap internal networking calls with retry logic
+
+           # Do not stub out :send_message_with_gle or :send_message
+           # It modifies the message, see CollectionWriter#send_write_operation
+
+           def receive_message(*args)
+             retry_on_connection_failure do
+               receive_message_original(*args)
+             end
+           end
+
+           def connect(*args)
+             retry_on_connection_failure do
+               connect_original(*args)
+             end
+           end
+
+           protected
+
+           def valid_opts(*args)
+             valid_opts_original(*args) + CONNECTION_RETRY_OPTS
+           end
+
+           def setup(opts)
+             self.reconnect_attempts = (opts.delete(:reconnect_attempts) || 53).to_i
+             self.reconnect_retry_seconds = (opts.delete(:reconnect_retry_seconds) || 0.1).to_f
+             self.reconnect_retry_multiplier = (opts.delete(:reconnect_retry_multiplier) || 2).to_f
+             self.reconnect_max_retry_seconds = (opts.delete(:reconnect_max_retry_seconds) || 5).to_f
+             setup_original(opts)
+           end
+
+         end
+       end
+
+       # Retry the supplied block when a Mongo::ConnectionFailure occurs
+       #
+       # Note: Check for Duplicate Key on inserts
+       #
+       # Returns the result of the block
+       #
+       # Example:
+       #   connection.retry_on_connection_failure { |retried| connection.ping }
+       def retry_on_connection_failure(&block)
+         raise "Missing mandatory block parameter on call to Mongo::Connection#retry_on_connection_failure" unless block
+         retried = false
+         mongos_retries = 0
+         begin
+           result = block.call(retried)
+           retried = false
+           result
+         rescue Mongo::ConnectionFailure => exc
+           # Retry if reconnected, but only once to prevent an infinite loop
+           logger.warn "Connection Failure: '#{exc.message}' [#{exc.error_code}]"
+           if !retried && reconnect
+             retried = true
+             # TODO There has to be a way to flush the connection pool of all inactive connections
+             retry
+           end
+           raise exc
+         rescue Mongo::OperationFailure => exc
+           # Workaround for the "not master" issue. Disconnect the connection when we get a
+           # "not master" error message. The driver checks for an exact match on "not master",
+           # whereas it sometimes gets: "not master and slaveok=false"
+           if exc.result
+             error = exc.result['err'] || exc.result['errmsg']
+             close if error && error.include?("not master")
+           end
+
+           # These get returned when connected to a local mongos router when it in turn
+           # has connection failures talking to the remote shards. All we do is retry the same operation
+           # since its connections to multiple remote shards may have failed.
+           # Disconnecting the current connection will not help since it is just to the mongos router
+           # First make sure it is connected to the mongos router
+           raise exc unless (MONGOS_CONNECTION_ERRORS.any? { |err| exc.message.include?(err) }) || (exc.message.strip == ':')
+
+           mongos_retries += 1
+           if mongos_retries <= 60
+             retried = true
+             Kernel.sleep(0.5)
+             logger.warn "[#{primary.inspect}] Router Connection Failure. Retry ##{mongos_retries}. Exc: '#{exc.message}' [#{exc.error_code}]"
+             # TODO Is there a way to flush the connection pool of all inactive connections
+             retry
+           end
+           raise exc
+         end
+       end
+
+       # Call this method whenever a Mongo::ConnectionFailure Exception
+       # has been raised to re-establish the connection
+       #
+       # This method is thread-safe and ensures that only one thread at a time
+       # per connection will attempt to re-establish the connection
+       #
+       # Returns whether the connection is connected again
+       def reconnect
+         logger.debug "Going to reconnect"
+
+         # Prevent other threads from invoking reconnect logic at the same time
+         @@failover_mutex.synchronize do
+           # Another thread may have already failed over the connection by the
+           # time this thread gets in
+           if active?
+             logger.info "Connected to: #{primary.inspect}"
+             return true
+           end
+
+           # Close all sockets that are not checked out so that other threads not
+           # currently waiting on Mongo don't get bad connections and have to
+           # retry each one in turn
+           @primary_pool.close if @primary_pool
+
+           if reconnect_attempts > 0
+             # Wait for other threads to finish working on their sockets
+             retries = 1
+             retry_seconds = reconnect_retry_seconds
+             begin
+               logger.warn "Connection unavailable. Waiting: #{retry_seconds} seconds before retrying"
+               sleep retry_seconds
+               # Call original connect method since it is already within a retry block
+               connect_original
+             rescue Mongo::ConnectionFailure => exc
+               if retries < reconnect_attempts
+                 retries += 1
+                 retry_seconds *= reconnect_retry_multiplier
+                 retry_seconds = reconnect_max_retry_seconds if retry_seconds > reconnect_max_retry_seconds
+                 retry
+               end
+
+               logger.error "Auto-reconnect giving up after #{retries} reconnect attempts"
+               raise exc
+             end
+             logger.info "Successfully reconnected to: #{primary.inspect}"
+           end
+           connected?
+         end
+
+       end
+
+     end
+   end
+ end
data/lib/mongo_ha/networking.rb ADDED
@@ -0,0 +1,58 @@
+ module MongoHA
+   module Networking
+     module InstanceMethods
+       def self.included(base)
+         base.class_eval do
+           # Fix problem where a Timeout exception is not checking the socket back into the pool
+           # Based on code from Gem V1.11.1, not needed with V1.12 or above
+           # Only change is the ensure block
+           def send_message_with_gle(operation, message, db_name, log_message=nil, write_concern=false)
+             docs = num_received = cursor_id = ''
+             add_message_headers(message, operation)
+
+             last_error_message = build_get_last_error_message(db_name, write_concern)
+             last_error_id = add_message_headers(last_error_message, Mongo::Constants::OP_QUERY)
+
+             packed_message = message.append!(last_error_message).to_s
+             sock = nil
+             begin
+               sock = checkout_writer
+               send_message_on_socket(packed_message, sock)
+               docs, num_received, cursor_id = receive(sock, last_error_id)
+               # Removed checkin
+               # checkin(sock)
+             rescue Mongo::ConnectionFailure, Mongo::OperationFailure, Mongo::OperationTimeout => ex
+               # Removed checkin
+               # checkin(sock)
+               raise ex
+             rescue SystemStackError, NoMemoryError, SystemCallError => ex
+               close
+               raise ex
+             # Added ensure block to always check sock back in
+             ensure
+               checkin(sock) if sock
+             end
+
+             if num_received == 1
+               error = docs[0]['err'] || docs[0]['errmsg']
+               if error && error.include?("not master")
+                 close
+                 raise Mongo::ConnectionFailure.new(docs[0]['code'].to_s + ': ' + error, docs[0]['code'], docs[0])
+               elsif (!error.nil? && note = docs[0]['jnote'] || docs[0]['wnote']) # assignment
+                 code = docs[0]['code'] || Mongo::ErrorCode::BAD_VALUE # as of server version 2.5.5
+                 raise Mongo::WriteConcernError.new(code.to_s + ': ' + note, code, docs[0])
+               elsif error
+                 code = docs[0]['code'] || Mongo::ErrorCode::UNKNOWN_ERROR
+                 error = "wtimeout" if error == "timeout"
+                 raise Mongo::WriteConcernError.new(code.to_s + ': ' + error, code, docs[0]) if error == "wtimeout"
+                 raise Mongo::OperationFailure.new(code.to_s + ': ' + error, code, docs[0])
+               end
+             end
+
+             docs[0]
+           end
+         end
+       end
+     end
+   end
+ end
data/lib/mongo_ha/version.rb ADDED
@@ -0,0 +1,3 @@
+ module MongoHA #:nodoc
+   VERSION = "1.11.0.rc1"
+ end
data/lib/mongo_ha.rb ADDED
@@ -0,0 +1,38 @@
+ require 'mongo'
+ require 'mongo_ha/version'
+ require 'mongo_ha/mongo_client'
+ require 'mongo_ha/networking'
+
+ # Give MongoClient a class-specific logger if SemanticLogger is available
+ # to give better logging information during a connection recovery scenario
+ if defined?(SemanticLogger)
+   Mongo::MongoClient.send(:include, SemanticLogger::Loggable)
+   Mongo::MongoClient.send(:define_method, :logger) { super() }
+ end
+
+ # Add in retry methods
+ # Use send(:include, ...) since Module#include is private before Ruby 2.1
+ Mongo::MongoClient.send(:include, MongoHA::MongoClient::InstanceMethods)
+
+ # Ensure connection is checked back into the pool when exceptions are thrown
+ # The following line is no longer required with Mongo V1.12 and above
+ Mongo::Networking.send(:include, MongoHA::Networking::InstanceMethods)
+
+ # Wrap critical Mongo methods with retry_on_connection_failure
+ {
+   Mongo::Collection => [
+     :aggregate, :count, :capped?, :distinct, :drop, :drop_index, :drop_indexes,
+     :ensure_index, :find_one, :find_and_modify, :group, :index_information,
+     :options, :stats, :map_reduce
+   ],
+   Mongo::CollectionOperationWriter => [:send_write_operation, :batch_message_send],
+   Mongo::CollectionCommandWriter => [:send_write_command, :batch_message_send]
+
+ }.each_pair do |klass, methods|
+   methods.each do |method|
+     original_method = "#{method}_original".to_sym
+     klass.send(:alias_method, original_method, method)
+     klass.send(:define_method, method) do |*args|
+       @connection.retry_on_connection_failure { send(original_method, *args) }
+     end
+   end
+ end
metadata ADDED
@@ -0,0 +1,65 @@
+ --- !ruby/object:Gem::Specification
+ name: mongo_ha
+ version: !ruby/object:Gem::Version
+   version: 1.11.0.rc1
+ platform: ruby
+ authors:
+ - Reid Morrison
+ autorequire:
+ bindir: bin
+ cert_chain: []
+ date: 2015-01-01 00:00:00.000000000 Z
+ dependencies:
+ - !ruby/object:Gem::Dependency
+   name: mongo
+   requirement: !ruby/object:Gem::Requirement
+     requirements:
+     - - "~>"
+       - !ruby/object:Gem::Version
+         version: 1.11.0
+   type: :runtime
+   prerelease: false
+   version_requirements: !ruby/object:Gem::Requirement
+     requirements:
+     - - "~>"
+       - !ruby/object:Gem::Version
+         version: 1.11.0
+ description: Automatic reconnects and recovery when replica-set changes, or connections
+   are lost, with transparent recovery
+ email:
+ - reidmo@gmail.com
+ executables: []
+ extensions: []
+ extra_rdoc_files: []
+ files:
+ - README.md
+ - Rakefile
+ - lib/mongo_ha.rb
+ - lib/mongo_ha/mongo_client.rb
+ - lib/mongo_ha/networking.rb
+ - lib/mongo_ha/version.rb
+ homepage: https://github.com/reidmorrison/mongo_ha
+ licenses:
+ - Apache License V2.0
+ metadata: {}
+ post_install_message:
+ rdoc_options: []
+ require_paths:
+ - lib
+ required_ruby_version: !ruby/object:Gem::Requirement
+   requirements:
+   - - ">="
+     - !ruby/object:Gem::Version
+       version: '0'
+ required_rubygems_version: !ruby/object:Gem::Requirement
+   requirements:
+   - - ">"
+     - !ruby/object:Gem::Version
+       version: 1.3.1
+ requirements: []
+ rubyforge_project:
+ rubygems_version: 2.4.5
+ signing_key:
+ specification_version: 4
+ summary: High availability for the mongo ruby driver
+ test_files: []