moped 1.0.0.alpha → 1.0.0.beta

data/README.md CHANGED
@@ -15,6 +15,21 @@ session[:artists].find(name: "Syd Vicious").
  )
  ```
 
+ ## Features
+
+ * Automated replica set node discovery and failover.
+ * No C or Java extensions
+ * No external dependencies
+ * Simple, stable, public API.
+
+ ### Unsupported Features
+
+ * GridFS
+ * Map/Reduce
+
+ These features are possible to implement, but outside the scope of Moped's
+ goals. Consider them perfect opportunities to write a companion gem!
+
  # Project Breakdown
 
  Moped is composed of three parts: an implementation of the [BSON
@@ -43,6 +58,31 @@ id.generation_time # => 2012-04-11 13:14:29 UTC
  id == Moped::BSON::ObjectId.from_string(id.to_s) # => true
  ```
 
+ <table><tbody>
+
+ <tr><th>new</th>
+ <td>Creates a new object id.</td></tr>
+
+ <tr><th>from_string</th>
+ <td>Creates a new object id from an object id string.
+ <br>
+ <code>Moped::BSON::ObjectId.from_string("4f8d8c66e5a4e45396000009")</code>
+ </td></tr>
+
+ <tr><th>from_time</th>
+ <td>Creates a new object id from a time.
+ <br>
+ <code>Moped::BSON::ObjectId.from_time(Time.new)</code>
+ </td></tr>
+
+ <tr><th>legal?</th>
+ <td>Validates an object id string.
+ <br>
+ <code>Moped::BSON::ObjectId.legal?("4f8d8c66e5a4e45396000009")</code>
+ </td></tr>
+
+ </tbody></table>
+
  ### Moped::BSON::Code
 
  The `Code` class is used for working with javascript on the server.
@@ -299,6 +339,126 @@ scope.one # nil
 
  </tbody></table>
 
+ # Exceptions
+
+ Here's a list of the exceptions generated by Moped.
+
+ <table><tbody>
+
+ <tr><th>Moped::Errors::ConnectionFailure</th>
+ <td>Raised when a node cannot be reached or a connection is lost.
+ <br>
+ <strong>Note:</strong> this exception is only raised if Moped could not
+ reconnect, so you shouldn't attempt to rescue this.</td></tr>
+
+ <tr><th>Moped::Errors::OperationFailure</th>
+ <td>Raised when a command fails or is invalid, such as when an insert fails in
+ safe mode.</td></tr>
+
+ <tr><th>Moped::Errors::QueryFailure</th>
+ <td>Raised when an invalid query was sent to the database.</td></tr>
+
+ <tr><th>Moped::Errors::AuthenticationFailure</th>
+ <td>Raised when invalid credentials were passed to <code>session.login</code>.</td></tr>
+
+ <tr><th>Moped::Errors::SocketError</th>
+ <td>Not a real exception, but a module used to tag unhandled exceptions inside
+ of a node's networking code. Allows you to <code>rescue Moped::SocketError</code> while
+ preserving the real exception.</td></tr>
+
+ </tbody></table>
+
+ Other exceptions are possible while running commands, such as IO errors around
+ failed connections. Moped tries to be smart about managing its connections,
+ such as checking if they're dead before executing a command; but those checks
+ aren't foolproof, and Moped is conservative about handling unexpected errors on
+ its connections. Namely, Moped will *not* retry a command if an unexpected
+ exception is raised. Why? Because it's impossible to know whether the command
+ was actually received by the remote Mongo instance, and without domain
+ knowledge it cannot be safely retried.
+
+ Take, for example, this case:
+
+ ```ruby
+ session.with(safe: true)["users"].insert(name: "John")
+ ```
+
+ It's entirely possible that the insert command will be sent to Mongo, but the
+ connection gets closed before we read the result for `getLastError`. In this
+ case, there's no way to know whether the insert was actually successful!
+
+ If, however, you want to gracefully handle this in your own application, you
+ could do something like:
+
+ ```ruby
+ document = { _id: Moped::BSON::ObjectId.new, name: "John" }
+
+ begin
+   session["users"].insert(document)
+ rescue Moped::Errors::SocketError
+   session["users"].find(_id: document[:_id]).upsert(document)
+ end
+ ```
+
+ # Replica Sets
+
+ Moped has full support for replica sets including automatic failover and node
+ discovery.
+
+ ## Automatic Failover
+
+ Moped will automatically retry lost connections and attempt to detect dead
+ connections before sending an operation. Note that it will *not* retry
+ individual operations! For example, these cases will work and not raise any
+ exceptions:
+
+ ```ruby
+ session[:users].insert(name: "John")
+ # kill primary node and promote secondary
+ session[:users].insert(name: "John")
+ session[:users].find.count # => 2.0
+
+ # primary node drops our connection
+ session[:users].insert(name: "John")
+ ```
+
+ However, you'll get an operation error in a case like:
+
+ ```ruby
+ # primary node goes down while reading the reply
+ session.with(safe: true)[:users].insert(name: "John")
+ ```
+
+ And you'll get a connection error in a case like:
+
+ ```ruby
+ # primary node goes down, no new primary available yet
+ session[:users].insert(name: "John")
+ ```
+
+ If your session is running with eventual consistency, read operations will
+ never raise connection errors as long as any secondary or primary node is
+ running. The only case where you'll see a connection failure is if a node goes
+ down while attempting to retrieve more results from a cursor, because cursors
+ are tied to individual nodes.
+
+ When two attempts to connect to a node fail, it will be marked as down. This
+ removes it from the list of available nodes for `:down_interval` (default 30
+ seconds). Note that the `:down_interval` only applies to normal operations;
+ that is, if you ask for a primary node and none is available, all nodes will be
+ retried. Likewise, if you ask for a secondary node, and no secondary or primary
+ node is available, all nodes will be retried.
+
+ ## Node Discovery
+
+ The addresses you pass into your session are used as seeds for setting up
+ replica set connections. After connection, each seed node will return a list of
+ other known nodes, which will be added to the set.
+
+ This information is cached according to the `:refresh_interval` option (default:
+ 5 minutes). That means, e.g., that if you add a new node to your replica set,
+ it should be represented in Moped within 5 minutes.
+
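For reference, the rewritten `Moped::Cluster` (see its hunk later in this diff) accepts both of these intervals directly. A minimal sketch, assuming a locally running replica set on placeholder hosts; the cluster is normally created for you by the session, so this is illustrative only:

```ruby
# Illustrative sketch: hosts are placeholders; the option values shown are the
# defaults from Cluster#initialize in this release.
cluster = Moped::Cluster.new(
  ["127.0.0.1:27017", "127.0.0.1:27018"],
  down_interval: 30,      # seconds a down node is skipped before being retried
  refresh_interval: 300   # seconds before a node's information is considered stale
)
cluster.nodes # => the currently available Moped::Node instances
```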
  # Thread-Safety
 
  Moped is thread-safe -- depending on your definition of thread-safe. For Moped,

data/lib/moped.rb CHANGED
@@ -6,14 +6,16 @@ require "forwardable"
  require "moped/bson"
  require "moped/cluster"
  require "moped/collection"
+ require "moped/connection"
  require "moped/cursor"
  require "moped/database"
  require "moped/errors"
  require "moped/indexes"
  require "moped/logging"
+ require "moped/node"
  require "moped/protocol"
  require "moped/query"
- require "moped/server"
  require "moped/session"
- require "moped/socket"
+ require "moped/session/context"
+ require "moped/threaded"
  require "moped/version"

data/lib/moped/bson/object_id.rb CHANGED
@@ -8,30 +8,33 @@ module Moped
  # Formatting string for outputting an ObjectId.
  @@string_format = ("%02x" * 12).freeze
 
- attr_reader :data
-
  class << self
  def from_string(string)
+ raise Errors::InvalidObjectId.new(string) unless legal?(string)
  data = []
  12.times { |i| data << string[i*2, 2].to_i(16) }
- new data
+ from_data data.pack("C12")
+ end
+
+ def from_time(time)
+ from_data @@generator.generate(time.to_i)
  end
 
  def legal?(str)
- !!str.match(/^[0-9a-f]{24}$/i)
+ !!str.match(/\A\h{24}\Z/i)
  end
- end
 
- def initialize(data = nil, time = nil)
- if data
- @data = data
- elsif time
- @data = @@generator.generate(time.to_i)
- else
- @data = @@generator.next
+ def from_data(data)
+ id = allocate
+ id.instance_variable_set :@data, data
+ id
  end
  end
 
+ def data
+ @data ||= @@generator.next
+ end
+
  def ==(other)
  BSON::ObjectId === other && data == other.data
  end
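As a quick illustration of the guard added to `from_string` above (the input is just an arbitrary malformed string):

```ruby
Moped::BSON::ObjectId.legal?("not-an-object-id")      # => false
Moped::BSON::ObjectId.from_string("not-an-object-id") # raises Moped::Errors::InvalidObjectId
```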
@@ -42,28 +45,27 @@ module Moped
  end
 
  def to_s
- @@string_format % data
+ @@string_format % data.unpack("C12")
  end
 
  # Return the UTC time at which this ObjectId was generated. This may
  # be used instead of a created_at timestamp since this information
  # is always encoded in the object id.
  def generation_time
- Time.at(@data.pack("C4").unpack("N")[0]).utc
+ Time.at(data.unpack("N")[0]).utc
  end
 
  class << self
  def __bson_load__(io)
- new io.read(12).unpack('C*')
+ from_data(io.read(12))
  end
-
  end
 
  def __bson_dump__(io, key)
  io << Types::OBJECT_ID
  io << key
  io << NULL_BYTE
- io << data.pack('C12')
+ io << data
  end
 
  # @api private
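Because the timestamp sits in the first four bytes of `data` (big-endian), `from_time` and `generation_time` round-trip cleanly. A quick sketch using the same example time as the README above:

```ruby
t  = Time.utc(2012, 4, 11, 13, 14, 29)
id = Moped::BSON::ObjectId.from_time(t)
id.generation_time # => 2012-04-11 13:14:29 UTC
id.to_s[0, 8]      # first 8 hex characters encode the seconds since the epoch
```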
@@ -71,49 +73,28 @@ module Moped
  def initialize
  # Generate and cache 3 bytes of identifying information from the current
  # machine.
- @machine_id = Digest::MD5.digest(Socket.gethostname).unpack("C3")
+ @machine_id = Digest::MD5.digest(Socket.gethostname).unpack("N")[0]
 
  @mutex = Mutex.new
- @last_timestamp = nil
  @counter = 0
  end
 
- # Return object id data based on the current time, incrementing a
- # counter for object ids generated in the same second.
+ # Return object id data based on the current time, incrementing the
+ # object id counter.
  def next
- now = Time.new.to_i
-
- counter = @mutex.synchronize do
- last_timestamp, @last_timestamp = @last_timestamp, now
-
- if last_timestamp == now
- @counter += 1
- else
- @counter = 0
- end
+ @mutex.lock
+ begin
+ counter = @counter = (@counter + 1) % 0xFFFFFF
+ ensure
+ @mutex.unlock rescue nil
  end
 
- generate(now, counter)
+ generate(Time.new.to_i, counter)
  end
 
- # Generate object id data for a given time using the provided +inc+.
- def generate(time, inc = 0)
- pid = Process.pid % 0xFFFF
-
- [
- time >> 24 & 0xFF, # 4 bytes time (network order)
- time >> 16 & 0xFF,
- time >> 8 & 0xFF,
- time & 0xFF,
- @machine_id[0], # 3 bytes machine
- @machine_id[1],
- @machine_id[2],
- pid >> 8 & 0xFF, # 2 bytes process id
- pid & 0xFF,
- inc >> 16 & 0xFF, # 3 bytes increment
- inc >> 8 & 0xFF,
- inc & 0xFF,
- ]
+ # Generate object id data for a given time using the provided +counter+.
+ def generate(time, counter = 0)
+ [time, @machine_id, Process.pid, counter << 8].pack("N NX lXX NX")
  end
  end
 

data/lib/moped/cluster.rb CHANGED
@@ -1,173 +1,133 @@
  module Moped
 
- # @api private
- #
- # The internal class managing connections to both a single node and replica
- # sets.
- #
- # @note Though the socket class itself *is* threadsafe, the cluster presently
- # is not. This means that in the course of normal operations sessions can be
- # shared across threads, but in failure modes (when a resync is required),
- # things can possibly go wrong.
  class Cluster
 
- # @return [Array] the user supplied seeds
+ # @return [Array<String>] the seeds the replica set was initialized with
  attr_reader :seeds
 
- # @return [Boolean] whether this is a direct connection
- attr_reader :direct
-
- # @return [Array] all available nodes
- attr_reader :servers
-
- # @return [Array] seeds gathered from cluster discovery
- attr_reader :dynamic_seeds
-
- # @param [Array] seeds an array of host:port pairs
- # @param [Boolean] direct (false) whether to connect directly to the hosts
- # provided or to find additional available nodes.
- def initialize(seeds, direct = false)
- @seeds = seeds
- @direct = direct
-
- @servers = []
- @dynamic_seeds = []
- end
-
- # @return [Array] available secondary nodes
- def secondaries
- servers.select(&:secondary?)
- end
-
- # @return [Array] available primary nodes
- def primaries
- servers.select(&:primary?)
- end
-
- # @return [Array] all known addresses from user supplied seeds, dynamically
- # discovered seeds, and active servers.
- def known_addresses
- [].tap do |addresses|
- addresses.concat seeds
- addresses.concat dynamic_seeds
- addresses.concat servers.map { |server| server.address }
- end.uniq
- end
-
- def remove(server)
- servers.delete(server)
- end
-
- def reconnect
- @servers = servers.map { |server| Server.new(server.address) }
+ # @option options :down_interval number of seconds to wait before attempting
+ # to reconnect to a down node. (30)
+ #
+ # @option options :refresh_interval number of seconds to cache information
+ # about a node. (300)
+ def initialize(hosts, options)
+ @options = {
+ down_interval: 30,
+ refresh_interval: 300
+ }.merge(options)
+
+ @seeds = hosts
+ @nodes = hosts.map { |host| Node.new(host) }
  end
 
- def sync
- known = known_addresses.shuffle
- seen = {}
-
- sync_seed = ->(seed) do
- server = Server.new seed
-
- unless seen[server.resolved_address]
- seen[server.resolved_address] = true
-
- hosts = sync_server(server)
-
- hosts.each do |host|
- sync_seed[host]
+ # Refreshes information for each of the nodes provided. The node list
+ # defaults to the list of all known nodes.
+ #
+ # If a node is successfully refreshed, any newly discovered peers will also
+ # be refreshed.
+ #
+ # @return [Array<Node>] the available nodes
+ def refresh(nodes_to_refresh = @nodes)
+ refreshed_nodes = []
+ seen = {}
+
+ # Set up a recursive lambda function for refreshing a node and its peers.
+ refresh_node = ->(node) do
+ unless seen[node]
+ seen[node] = true
+
+ # Add the node to the global list of known nodes.
+ @nodes << node unless @nodes.include?(node)
+
+ begin
+ node.refresh
+
+ # This node is good, so add it to the list of nodes to return.
+ refreshed_nodes << node unless refreshed_nodes.include?(node)
+
+ # Now refresh any newly discovered peer nodes.
+ (node.peers - @nodes).each &refresh_node
+ rescue Errors::ConnectionFailure
+ # We couldn't connect to the node, so don't do anything with it.
  end
  end
  end
 
- known.each do |seed|
- sync_seed[seed]
- end
+ nodes_to_refresh.each &refresh_node
 
- unless servers.empty?
- @dynamic_seeds = servers.map(&:address)
- end
-
- true
+ refreshed_nodes.to_a
  end
 
- def sync_server(server)
- [].tap do |hosts|
- socket = server.socket
-
- if socket.connect
- info = socket.simple_query Protocol::Command.new(:admin, ismaster: 1)
-
- if info["ismaster"]
- server.primary = true
- end
-
- if info["secondary"]
- server.secondary = true
- end
+ # Returns the list of available nodes, refreshing 1) any nodes which were
+ # down and ready to be checked again and 2) any nodes whose information is
+ # out of date.
+ #
+ # @return [Array<Node>] the list of available nodes.
+ def nodes
+ # Find the nodes that were down but are ready to be refreshed, or those
+ # with stale connection information.
+ needs_refresh, available = @nodes.partition do |node|
+ (node.down? && node.down_at < (Time.new - @options[:down_interval])) ||
+ node.needs_refresh?(Time.new - @options[:refresh_interval])
+ end
 
- if info["primary"]
- hosts.push info["primary"]
- end
+ # Refresh those nodes.
+ available.concat refresh(needs_refresh)
 
- if info["hosts"]
- hosts.concat info["hosts"]
- end
+ # Now return all the nodes that are available.
+ available.reject &:down?
+ end
 
- if info["passives"]
- hosts.concat info["passives"]
+ # Yields the replica set's primary node to the provided block. This method
+ # will retry the block in case of connection errors or replica set
+ # reconfiguration.
+ #
+ # @raises ConnectionFailure when no primary node can be found
+ def with_primary(retry_on_failure = true, &block)
+ if node = nodes.find(&:primary?)
+ begin
+ node.ensure_primary do
+ return yield node.apply_auth(auth)
  end
-
- merge(server)
-
+ rescue Errors::ConnectionFailure, Errors::ReplicaSetReconfigured
+ # Fall through to the code below if our connection was dropped or the
+ # node is no longer the primary.
  end
- end.uniq
- end
-
- def merge(server)
- previous = servers.find { |other| other == server }
- primary = server.primary?
- secondary = server.secondary?
+ end
 
- if previous
- previous.merge(server)
+ if retry_on_failure
+ # We couldn't find a primary node, so refresh the list and try again.
+ refresh
+ with_primary(false, &block)
  else
- servers << server
+ raise Errors::ConnectionFailure, "Could not connect to a primary node for replica set #{inspect}"
  end
  end
 
- # @param [:read, :write] mode the type of socket to return
- # @return [Socket] a socket valid for +mode+ operations
- def socket_for(mode)
- sync unless primaries.any? || (secondaries.any? && mode == :read)
-
- server = nil
- while primaries.any? || (secondaries.any? && mode == :read)
- if mode == :write || secondaries.empty?
- server = primaries.sample
- else
- server = secondaries.sample
- end
-
- if server
- socket = server.socket
- socket.connect unless socket.connection
-
- if socket.alive?
- break server
- else
- remove server
- end
+ # Yields a secondary node if available, otherwise the primary node. This
+ # method will retry the block in case of connection errors.
+ #
+ # @raises ConnectionFailure when no secondary or primary node can be found
+ def with_secondary(retry_on_failure = true, &block)
+ available_nodes = nodes.shuffle!.partition(&:secondary?).flatten
+
+ while node = available_nodes.shift
+ begin
+ return yield node.apply_auth(auth)
+ rescue Errors::ConnectionFailure
+ # That node's no good, so let's try the next one.
+ next
  end
  end
 
- unless server
- raise Errors::ConnectionFailure.new("Could not connect to any primary or secondary servers")
+ if retry_on_failure
+ # We couldn't find a secondary or primary node, so refresh the list and
+ # try again.
+ refresh
+ with_secondary(false, &block)
+ else
+ raise Errors::ConnectionFailure, "Could not connect to any secondary or primary nodes for replica set #{inspect}"
  end
-
- socket = server.socket
- socket.apply_auth auth
- socket
  end
 
  # @return [Hash] the cached authentication credentials for this cluster.
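For illustration, here is roughly how a caller (in practice Moped's session internals rather than application code) might drive the two methods above; the host string is a placeholder:

```ruby
cluster = Moped::Cluster.new(["127.0.0.1:27017"], {})

# Writes go through the primary; if none is found, or the connection drops,
# the node list is refreshed and the block is retried once.
cluster.with_primary do |node|
  # issue a write operation against `node` here
end

# Reads prefer a secondary, falling back to the primary when none responds.
cluster.with_secondary do |node|
  # issue a read operation against `node` here
end
```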
@@ -175,19 +135,11 @@ module Moped
  @auth ||= {}
  end
 
- # Log in to +database+ with +username+ and +password+. Does not perform the
- # actual log in, but saves the credentials for later authentication on a
- # socket.
- def login(database, username, password)
- auth[database.to_s] = [username, password]
- end
+ private
 
- # Log out of +database+. Does not perform the actual log out, but will log
- # out when the socket is used next.
- def logout(database)
- auth.delete(database.to_s)
+ def initialize_copy(_)
+ @nodes = @nodes.map &:dup
  end
 
  end
-
  end