mongo-oplog-backup 0.0.6 → 0.0.7

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA1:
3
- metadata.gz: 58c9a1ccd7bbe033e5ebd295015a4b0f092d9525
4
- data.tar.gz: d7a0ccf762945fc2ffafbbd3f24955397150928e
3
+ metadata.gz: 41408b43d702bfbe7c3b4ea5ff0d46c89893d6e9
4
+ data.tar.gz: cb54777ed47f5c31f1bd9aef664397b59f19806b
5
5
  SHA512:
6
- metadata.gz: 8f62c0d4cdeac23d1e223424f5d5a930c81d4317c8acb5282d1e44b4a3ddb6453a4c7d0627f72b09c22f38f0d505cb0c6dcf89e10aae2b9bae465689ba4d6a55
7
- data.tar.gz: 4dcbd7c51dc11cb87b0cd9e20a7726ee5e629b7692c330037d1ae5b027d59268ee0afb9b19b18a866163bbf7eadad4f77d8eca4766ea64aa803fad0539fc5462
6
+ metadata.gz: ec83326fefad756510b66faf9fd8b88e268f140ba447c803bc3898cc1345565c9d58aaab71877916d3272bf69e797bcdbd66ee34b6d2bf8c25fdd066aeebde46
7
+ data.tar.gz: 7674468a302f77ce3b0039b9e53d6182c09f810542b4b76cb7c14a302f7c08051de6d53d9cfc67a8144bc67e657c92c8d7064dbb2e7c233dc41fab59dc46df01
data/README.md CHANGED
@@ -4,8 +4,39 @@
4
4
 
5
5
  **Not ready for any important data yet. Use at your own risk.**
6
6
 
7
+ ## Introduction
8
+
9
+ This project aims to enable incremental backups with point-in-time restore
10
+ functionality, utilizing MongoDB's oplog and standard tools wherever possible.
11
+
12
+ A backup script can be run from a cron job, and each incremental run produces
13
+ a single file that can be stored on your preferred medium, for example Amazon S3
14
+ or an FTP site. This project only provides the tools to produce the backup files,
15
+ and it's up to you to transfer them to a backup medium.
16
+
17
+ Internally the `mongodump` command is used for the backup operations. Initially
18
+ a full dump is performed, after which incremental backups are performed by backing
19
+ up new sections of the oplog. Only the standard BSON format from mongodump is used.
20
+
21
+ To restore a backup, the incremental oplogs are merged into a single file and combined
22
+ with the initial full dump, which can then be restored with a standard
23
+ `mongorestore --oplogReplay` command. A point-in-time restore can be performed with the `--oplogLimit`
24
+ option of `mongorestore`. Additional support for this may be added to the
25
+ oplog merging command in the future to simplify the process.
26
+
27
+ Incremental oplogs always overlap by exactly one entry, so that integrity can easily
28
+ be verified (e.g. that there are no gaps between incremental oplogs).
29
+
30
+
31
+
7
32
  ## Installation
8
33
 
34
+ Install released gem (recommended):
35
+
36
+ gem install mongo-oplog-backup
37
+
38
+ Install latest development version:
39
+
9
40
  git clone git@github.com:journeyapps/mongo-oplog-backup.git
10
41
  cd mongo-oplog-backup
11
42
  rake install
@@ -19,23 +50,39 @@ To backup from localhost to the `mybackup` directory.
19
50
  The first run will perform a full backup. Subsequent runs will back up any new entries from the oplog.
20
51
  A full backup can be forced with the `--full` option.
21
52
 
22
- For connection options, see `mongo-oplog-backup backup --help`.
53
+ Sample cron script to perform incremental backups every 15 minutes, and a full backup once a week at 00:05:
54
+
55
+ 0,15,30,45 * * * * /path/to/ruby/bin/mongo-oplog-backup backup --dir /path/to/backup/location --oplog >> /path/to/backup.log
56
+ 5 0 * * 1 /path/to/ruby/bin/mongo-oplog-backup backup --dir /path/to/backup/location --full >> /path/to/backup.log
57
+
58
+ It is recommended to do a full backup every few days. The restore process may
59
+ be very inefficient if the oplogs grow larger than a full backup.
60
+
61
+ For connection and authentication options, see `mongo-oplog-backup backup --help`.
62
+
63
+ The backup commands work on a live server. The initial dump with oplog replay relies
64
+ on the idempotency of the oplog to have a consistent snapshot, similar to `mongodump --oplog`.
65
+ That said, there have been bugs in the past that caused the oplog to not be idempotent
66
+ in some edge cases. Therefore it is recommended to stop the secondary before performing
67
+ a full backup.
23
68
 
24
69
  ## To restore
25
70
 
26
71
  mongo-oplog-backup merge --dir mybackup/backup-<timestamp>
27
-
72
+
28
73
  The above command merges the individual oplog backups into `mybackup/backup-<timestamp>/dump/oplog.bson`.
29
74
  This allows you to restore the backup with the `mongorestore` command:
30
75
 
31
76
  mongorestore --drop --oplogReplay mybackup/backup-<timestamp>/dump
32
-
33
77
 
34
78
  ## Backup structure
35
79
 
36
80
  * `backup.json` - Stores the current state (oplog timestamp and backup folder).
37
81
  The only file required to perform incremental backups. It is not used for restoring a backup.
82
+ * `backup.lock` - Lock file to prevent two full backups from running concurrently.
38
83
  * `backup-<timestamp>` - The current backup folder.
84
+ * `backup.lock` - Lock file preventing two backups running concurrently in this folder.
85
+ * `state.json` - Backup state (oplog timestamp).
39
86
  * `dump` - a full mongodump
40
87
  * `oplog-<start>-<end>.bson` - The oplog from the start timestamp until the end timestamp (inclusive).
41
88
 
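The one-entry overlap described in the README can be checked mechanically. The following is a minimal sketch and not part of the gem: it assumes each incremental oplog file has already been reduced to its first and last timestamp, represented here as hypothetical `[seconds, increment]` pairs, and it reports any adjacent segments that do not share a boundary entry.

```ruby
# Hypothetical helper, not part of mongo-oplog-backup.
# Each segment is described by the first and last oplog timestamp it contains,
# given as [seconds, increment] pairs (an assumed representation).
Segment = Struct.new(:first, :last)

# Returns the pairs of adjacent segments that do not share a boundary entry.
# Because each incremental backup starts at the last entry of the previous one,
# an unbroken chain has prev.last == curr.first for every adjacent pair.
def oplog_gaps(segments)
  sorted = segments.sort_by(&:first)
  sorted.each_cons(2).reject { |prev, curr| prev.last == curr.first }
end

segments = [
  Segment.new([100, 1], [150, 3]),
  Segment.new([150, 3], [200, 1]),
  Segment.new([210, 2], [260, 5]) # starts after the previous segment ended
]

gaps = oplog_gaps(segments)
gaps.each do |prev, curr|
  puts "Gap between #{prev.last.inspect} and #{curr.first.inspect}"
end
```

An empty result means every incremental file picks up exactly where the previous one ended.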
@@ -12,7 +12,6 @@ opts = Slop.parse(help: true, strict: true) do
12
12
  on :d, :dir, "Directory to store backup files. Defaults to 'backup'.", argument: :required
13
13
  on :full, 'Force full backup'
14
14
  on :oplog, 'Force oplog backup'
15
- on :'if-not-busy', 'Do nothing when another backup is busy running.'
16
15
 
17
16
  on :f, :file, 'Configuration file for common defaults', argument: :required
18
17
  on :ssl, "Connect to a mongod instance over an SSL connection"
@@ -41,7 +40,7 @@ opts = Slop.parse(help: true, strict: true) do
41
40
  end
42
41
  config = MongoOplogBackup::Config.new(config_opts)
43
42
  backup = MongoOplogBackup::Backup.new(config)
44
- backup.perform(mode, if_not_busy: opts.if_not_busy?)
43
+ backup.perform(mode)
45
44
  end
46
45
  end
47
46
 
@@ -7,14 +7,35 @@ end
7
7
 
8
8
  module MongoOplogBackup
9
9
  class Backup
10
- attr_reader :config
10
+ attr_reader :config, :backup_name
11
11
 
12
- def initialize(config)
12
+ def backup_folder
13
+ return nil unless backup_name
14
+ File.join(config.backup_dir, backup_name)
15
+ end
16
+
17
+ def state_file
18
+ File.join(backup_folder, 'state.json')
19
+ end
20
+
21
+ def initialize(config, backup_name=nil)
13
22
  @config = config
23
+ @backup_name = backup_name
24
+ if backup_name.nil?
25
+ state_file = config.global_state_file
26
+ state = JSON.parse(File.read(state_file)) rescue nil
27
+ state ||= {}
28
+ @backup_name = state['backup']
29
+ end
30
+ end
31
+
32
+ def write_state(state)
33
+ File.write(state_file, state.to_json)
14
34
  end
15
35
 
16
36
  def lock(lockname, &block)
17
37
  File.open(lockname, File::RDWR|File::CREAT, 0644) do |file|
38
+ # Get a non-blocking lock
18
39
  got_lock = file.flock(File::LOCK_EX|File::LOCK_NB)
19
40
  if got_lock == false
20
41
  raise LockError, "Failed to acquire lock - another backup may be busy"
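The locking above relies on `File#flock` with `File::LOCK_NB`, which returns `false` instead of waiting when the lock is already held. A minimal standalone sketch of the same pattern follows. `with_lock` and the local `LockError` class are illustrative names rather than the gem's API, and the behaviour shown assumes a platform with BSD-style `flock` semantics, such as Linux or macOS.

```ruby
require 'tmpdir'

# Illustrative error class; the gem defines its own LockError elsewhere.
class LockError < StandardError; end

# Runs the block while holding an exclusive, non-blocking lock on lockname.
# Raises LockError immediately if the lock is already held.
def with_lock(lockname)
  File.open(lockname, File::RDWR | File::CREAT, 0o644) do |file|
    # LOCK_NB makes flock return false instead of blocking.
    got_lock = file.flock(File::LOCK_EX | File::LOCK_NB)
    raise LockError, 'Failed to acquire lock - another backup may be busy' if got_lock == false
    begin
      yield
    ensure
      file.flock(File::LOCK_UN)
    end
  end
end

lock_path = File.join(Dir.mktmpdir, 'backup.lock')

nested_result = with_lock(lock_path) do
  begin
    # A second attempt on the same file while the first lock is held.
    with_lock(lock_path) { :acquired_twice }
  rescue LockError
    :rejected
  end
end
```

Failing fast suits a cron-driven tool: an overlapping run exits with an error instead of queueing up behind the one still in progress.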
@@ -24,16 +45,14 @@ module MongoOplogBackup
24
45
  end
25
46
 
26
47
  def backup_oplog(options={})
27
- start_at = options[:start]
28
- backup = options[:backup]
29
- raise ArgumentError, ":backup is required" unless backup
48
+ raise ArgumentError, "No state in #{backup_name}" unless File.exists? state_file
49
+
50
+ backup_state = JSON.parse(File.read(state_file))
51
+ start_at = options[:start] || BSON::Timestamp.from_json(backup_state['position'])
30
52
  raise ArgumentError, ":start is required" unless start_at
31
53
 
32
- if start_at
33
- query = ['--query', "{ts : { $gte : { $timestamp : { t : #{start_at.seconds}, i : #{start_at.increment} } } }}"]
34
- else
35
- query = []
36
- end
54
+ query = ['--query', "{ts : { $gte : { $timestamp : { t : #{start_at.seconds}, i : #{start_at.increment} } } }}"]
55
+
37
56
  config.mongodump(['--out', config.oplog_dump_folder,
38
57
  '--db', 'local', '--collection', 'oplog.rs'] +
39
58
  query)
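The `--query` argument above selects every oplog entry at or after the starting timestamp. Because the comparison is `$gte`, the last entry of the previous run is fetched again, which is what produces the one-entry overlap between incremental files. A small sketch of the string construction, using a hypothetical `OplogPosition` struct in place of `BSON::Timestamp` so that it runs without the bson gem:

```ruby
# Stand-in for BSON::Timestamp, used only so the example is self-contained.
OplogPosition = Struct.new(:seconds, :increment)

# Builds the mongodump arguments that restrict the dump to entries
# with ts >= start_at, mirroring the string used in backup_oplog.
def oplog_query_args(start_at)
  ['--query',
   "{ts : { $gte : { $timestamp : { t : #{start_at.seconds}, i : #{start_at.increment} } } }}"]
end

# Sample values only; a real position comes from the backup's state file.
args = oplog_query_args(OplogPosition.new(1409788800, 3))
puts args.join(' ')
```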
@@ -70,10 +89,13 @@ module MongoOplogBackup
70
89
  result[:empty] = true
71
90
  else
72
91
  outfile = "oplog-#{first}-#{last}.bson"
73
- full_path = File.join(config.backup_dir, backup, outfile)
74
- FileUtils.mkdir_p File.join(config.backup_dir, backup)
92
+ full_path = File.join(backup_folder, outfile)
93
+ FileUtils.mkdir_p backup_folder
75
94
  FileUtils.mv config.oplog_dump, full_path
76
95
 
96
+ write_state({
97
+ 'position' => result[:position]
98
+ })
77
99
  result[:file] = full_path
78
100
  result[:empty] = false
79
101
  end
@@ -97,12 +119,27 @@ module MongoOplogBackup
97
119
  def backup_full
98
120
  position = latest_oplog_timestamp
99
121
  raise "Cannot backup with empty oplog" if position.nil?
100
- backup_name = "backup-#{position}"
101
- dump_folder = File.join(config.backup_dir, backup_name, 'dump')
102
- config.mongodump('--out', dump_folder)
122
+ @backup_name = "backup-#{position}"
123
+ if File.exists? backup_folder
124
+ raise "Backup folder '#{backup_folder}' already exists; not performing backup."
125
+ end
126
+ dump_folder = File.join(backup_folder, 'dump')
127
+ result = config.mongodump('--out', dump_folder)
103
128
  unless File.directory? dump_folder
129
+ MongoOplogBackup.log.error 'Backup folder does not exist'
104
130
  raise 'Full backup failed'
105
131
  end
132
+
133
+ File.write(File.join(dump_folder, 'debug.log'), result.standard_output)
134
+
135
+ unless result.standard_error.length == 0
136
+ File.write(File.join(dump_folder, 'error.log'), result.standard_error)
137
+ end
138
+
139
+ write_state({
140
+ 'position' => position
141
+ })
142
+
106
143
  return {
107
144
  position: position,
108
145
  backup: backup_name
@@ -110,62 +147,40 @@ module MongoOplogBackup
110
147
  end
111
148
 
112
149
  def perform(mode=:auto, options={})
113
- if_not_busy = options[:if_not_busy] || false
114
-
115
- perform_oplog_afterwards = false
116
-
117
150
  FileUtils.mkdir_p config.backup_dir
118
- lock(config.lock_file) do
119
- state_file = config.state_file
120
- state = JSON.parse(File.read(state_file)) rescue nil
121
- state ||= {}
122
- have_position = (state['position'] && state['backup'])
151
+ have_backup = backup_folder != nil
123
152
 
124
- if mode == :auto
125
- if have_position
126
- mode = :oplog
127
- else
128
- mode = :full
129
- end
153
+ if mode == :auto
154
+ if have_backup
155
+ mode = :oplog
156
+ else
157
+ mode = :full
130
158
  end
159
+ end
131
160
 
132
- if mode == :oplog
133
- raise "Unknown backup position - cannot perform oplog backup." unless have_position
134
- MongoOplogBackup.log.info "Performing incremental oplog backup"
135
- position = BSON::Timestamp.from_json(state['position'])
136
- result = backup_oplog(start: position, backup: state['backup'])
161
+ if mode == :oplog
162
+ raise "Unknown backup position - cannot perform oplog backup." unless have_backup
163
+ MongoOplogBackup.log.info "Performing incremental oplog backup"
164
+ lock(File.join(backup_folder, 'backup.lock')) do
165
+ result = backup_oplog
137
166
  unless result[:empty]
138
167
  new_entries = result[:entries] - 1
139
- state['position'] = result[:position]
140
- File.write(state_file, state.to_json)
141
168
  MongoOplogBackup.log.info "Backed up #{new_entries} new entries to #{result[:file]}"
142
169
  else
143
170
  MongoOplogBackup.log.info "Nothing new to backup"
144
171
  end
145
- elsif mode == :full
172
+ end
173
+ elsif mode == :full
174
+ lock(config.global_lock_file) do
146
175
  MongoOplogBackup.log.info "Performing full backup"
147
176
  result = backup_full
148
- state = result
149
- File.write(state_file, state.to_json)
177
+ File.write(config.global_state_file, {
178
+ 'backup' => result[:backup]
179
+ }.to_json)
150
180
  MongoOplogBackup.log.info "Performed full backup"
151
-
152
- perform_oplog_afterwards = true
153
181
  end
154
- end
155
-
156
- # Has to be outside the lock
157
- if perform_oplog_afterwards
158
- # Oplog backup
159
182
  perform(:oplog, options)
160
183
  end
161
-
162
- rescue LockError => e
163
- if if_not_busy
164
- MongoOplogBackup.log.info e.message
165
- MongoOplogBackup.log.info 'Not performing backup'
166
- else
167
- raise
168
- end
169
184
  end
170
185
 
171
186
  def latest_oplog_timestamp_moped
@@ -41,18 +41,18 @@ module MongoOplogBackup
41
41
  end
42
42
 
43
43
  def oplog_dump_folder
44
- File.join(backup_dir, 'dump')
44
+ File.join(backup_dir, 'tmp-dump')
45
45
  end
46
46
 
47
47
  def oplog_dump
48
48
  File.join(oplog_dump_folder, 'local/oplog.rs.bson')
49
49
  end
50
50
 
51
- def state_file
51
+ def global_state_file
52
52
  File.join(backup_dir, 'backup.json')
53
53
  end
54
54
 
55
- def lock_file
55
+ def global_lock_file
56
56
  File.join(backup_dir, 'backup.lock')
57
57
  end
58
58
 
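Taken together, these changes split the state into two levels: `backup.json` in the backup directory names the current backup folder, and `state.json` inside that folder records the oplog position. The sketch below reproduces that layout with plain files to show how `:auto` mode is resolved. The helper names `current_backup_name` and `resolve_mode` are illustrative and not part of the gem, and the folder name and position value are made-up sample data.

```ruby
require 'json'
require 'tmpdir'
require 'fileutils'

# Reads the global state file and returns the current backup name, or nil
# if the file is missing or unreadable (as in Backup#initialize).
def current_backup_name(backup_dir)
  state = JSON.parse(File.read(File.join(backup_dir, 'backup.json'))) rescue nil
  (state || {})['backup']
end

# :auto becomes :oplog when a current backup is known, otherwise :full.
def resolve_mode(mode, backup_dir)
  return mode unless mode == :auto
  current_backup_name(backup_dir) ? :oplog : :full
end

backup_dir = Dir.mktmpdir
first_mode = resolve_mode(:auto, backup_dir) # no backup.json yet

# Simulate what a full backup records: the global pointer plus per-backup state.
backup_name = 'backup-example'
FileUtils.mkdir_p File.join(backup_dir, backup_name)
File.write(File.join(backup_dir, 'backup.json'), { 'backup' => backup_name }.to_json)
File.write(File.join(backup_dir, backup_name, 'state.json'), { 'position' => 'sample' }.to_json)

second_mode = resolve_mode(:auto, backup_dir)
```

Keeping the position inside the backup folder means each backup is self-describing, while the global file only has to answer which backup is current.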
@@ -1,3 +1,3 @@
1
1
  module MongoOplogBackup
2
- VERSION = "0.0.6"
2
+ VERSION = "0.0.7"
3
3
  end
@@ -6,7 +6,7 @@ describe MongoOplogBackup do
6
6
  MongoOplogBackup::VERSION.should_not be_nil
7
7
  end
8
8
 
9
- let(:backup) { MongoOplogBackup::Backup.new(MongoOplogBackup::Config.new dir: 'spec-tmp/backup') }
9
+ let(:backup) { MongoOplogBackup::Backup.new(MongoOplogBackup::Config.new(dir: 'spec-tmp/backup'), 'backup1') }
10
10
 
11
11
  before(:all) do
12
12
  # We need one entry in the oplog to start with
@@ -44,12 +44,18 @@ describe MongoOplogBackup do
44
44
  end
45
45
  end
46
46
  last = backup.latest_oplog_timestamp
47
- result = backup.backup_oplog(start: first, backup: 'backup1')
47
+ FileUtils.mkdir_p backup.backup_folder
48
+ backup.write_state({position: first})
49
+ result = backup.backup_oplog(backup: 'backup1')
50
+ result[:entries].should == 6
51
+ result[:empty].should == false
52
+ result[:position].should == last
53
+ result[:first].should == first
54
+
48
55
  file = result[:file]
49
56
  timestamps = MongoOplogBackup::Oplog.oplog_timestamps(file)
50
57
  timestamps.count.should == 6
51
58
  timestamps.first.should == first
52
59
  timestamps.last.should == last
53
-
54
60
  end
55
61
  end
metadata CHANGED
@@ -1,14 +1,14 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: mongo-oplog-backup
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.0.6
4
+ version: 0.0.7
5
5
  platform: ruby
6
6
  authors:
7
7
  - Ralf Kistner
8
8
  autorequire:
9
9
  bindir: bin
10
10
  cert_chain: []
11
- date: 2014-08-22 00:00:00.000000000 Z
11
+ date: 2014-09-04 00:00:00.000000000 Z
12
12
  dependencies:
13
13
  - !ruby/object:Gem::Dependency
14
14
  name: bson