@cloudant/couchbackup 2.10.0-SNAPSHOT.199 → 2.10.1-217
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/README.md +85 -73
- package/app.js +254 -269
- package/bin/couchbackup.bin.js +1 -3
- package/bin/couchrestore.bin.js +2 -4
- package/includes/allDocsGenerator.js +53 -0
- package/includes/backup.js +103 -247
- package/includes/backupMappings.js +260 -0
- package/includes/config.js +10 -9
- package/includes/error.js +42 -44
- package/includes/liner.js +134 -23
- package/includes/logfilegetbatches.js +25 -60
- package/includes/logfilesummary.js +41 -71
- package/includes/parser.js +3 -3
- package/includes/request.js +95 -106
- package/includes/restore.js +45 -14
- package/includes/restoreMappings.js +141 -0
- package/includes/spoolchanges.js +57 -79
- package/includes/transforms.js +378 -0
- package/package.json +7 -10
- package/includes/change.js +0 -41
- package/includes/shallowbackup.js +0 -80
- package/includes/writer.js +0 -164
package/README.md
CHANGED
@@ -14,15 +14,15 @@
 |_|
 ```

-CouchBackup is a command-line utility that
+CouchBackup is a command-line utility that backs up a Cloudant or CouchDB database to a text file.
 It comes with a companion command-line utility that can restore the backed up data.

 ## Limitations

-
+CouchBackup has some restrictions in the data it's able to backup:

-*
-*
+* **`couchbackup` does not do CouchDB replication as such, it simply streams through a database's `_changes` feed, and uses `POST /db/_bulk_get` to fetch the documents, storing the documents it finds on disk.**
+* **`couchbackup` does not support backing up or restoring databases containing documents with attachments. The recommendation is to store attachments directly in an object store. DO NOT USE THIS TOOL FOR DATABASES CONTAINING ATTACHMENTS.** [Note](#note-on-attachments)

 ## Installation

@@ -38,11 +38,11 @@ npm install -g @cloudant/couchbackup

 ### Snapshots

-The latest builds of main are
+The latest builds of the `main` branch are available on npm with the `snapshot` tag. Use the `snapshot` tag if you want to experiment with an unreleased fix or new function, but please note that snapshot versions are **not supported**.

 ## Usage

-
+Use either environment variables or command-line options to specify the URL of the CouchDB or Cloudant instance, and the database to work with.

 ### The URL

@@ -57,19 +57,19 @@ or
 export COUCH_URL=https://myusername:mypassword@myhost.cloudant.com
 ```

-
+Or use the `--url` command-line parameter.

 When passing credentials in the user information subcomponent of the URL
 they must be [percent encoded](https://tools.ietf.org/html/rfc3986#section-3.2.1).
 Specifically, within either the username or password, the characters `: / ? # [ ] @ %`
-_MUST_ be precent-encoded, other characters _MAY_ be percent
+_MUST_ be precent-encoded, other characters _MAY_ be percent-encoded.

 For example, for the username `user123` and password `colon:at@321`:
 ```
 https://user123:colon%3aat%40321@localhost:5984
 ```

-Note
+Note take extra care to escape shell reserved characters when
 setting the environment variable or command-line parameter.

 ### The Database name
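
The percent-encoding rule in the hunk above can be sketched in a few lines of JavaScript. This is only an illustration: `withCredentials` is a hypothetical helper, not part of `couchbackup`, and it assumes the standard `encodeURIComponent` function is an acceptable encoder for the user information subcomponent (it escapes all of `: / ? # [ ] @ %`). It emits upper-case hex digits, which RFC 3986 treats as equivalent to the lower-case form used in the README example.

```javascript
// Hypothetical helper (not part of couchbackup): inserts percent-encoded
// credentials into a URL suitable for COUCH_URL or --url.
function withCredentials(baseUrl, username, password) {
  const sep = baseUrl.indexOf('://');
  if (sep === -1) {
    throw new Error('expected a URL with a scheme, for example https://host');
  }
  const scheme = baseUrl.slice(0, sep);
  const hostAndPath = baseUrl.slice(sep + 3);
  // encodeURIComponent escapes every character the README lists as MUST-encode.
  const user = encodeURIComponent(username);
  const pass = encodeURIComponent(password);
  return `${scheme}://${user}:${pass}@${hostAndPath}`;
}

console.log(withCredentials('https://localhost:5984', 'user123', 'colon:at@321'));
// prints https://user123:colon%3Aat%40321@localhost:5984
```

The result still has to be quoted or escaped for the shell when it is exported as an environment variable, as the README text notes.
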
@@ -80,7 +80,7 @@ To define the name of the database to backup or restore, set the `COUCH_DATABASE
 export COUCH_DATABASE=animaldb
 ```

-
+Or use the `--db` command-line parameter

 ## Backup

@@ -98,19 +98,19 @@ couchbackup --db animaldb > animaldb.txt

 ## Logging & resuming backups

-You may also create a log file which records the progress of the backup with the `--log` parameter
+You may also create a log file which records the progress of the backup with the `--log` parameter, for example:

 ```sh
 couchbackup --db animaldb --log animaldb.log > animaldb.txt
 ```

-
+Use this log file to resume backups with `--resume true`:

 ```sh
 couchbackup --db animaldb --log animaldb.log --resume true >> animaldb.txt
 ```

-The `--resume true` option works for a backup that has finished spooling changes, but has not yet completed downloading all the necessary batches of documents. It
+The `--resume true` option works for a backup that has finished spooling changes, but has not yet completed downloading all the necessary batches of documents. It _is not an incremental backup_ solution.

 You may also specify the name of the output file, rather than directing the backup data to *stdout*:

@@ -118,9 +118,13 @@ You may also specify the name of the output file, rather than directing the back
 couchbackup --db animaldb --log animaldb.log --resume true --output animaldb.txt
 ```

+### Compatibility note
+
+When using `--resume` use the same version of `couchbackup` that started the backup.
+
 ## Restore

-Now
+Now restore the backup text file to a new, empty, existing database using the `couchrestore`:

 ```sh
 cat animaldb.txt | couchrestore
@@ -132,9 +136,15 @@ or specifying the database name on the command-line:
 cat animaldb.txt | couchrestore --db animaldb2
 ```

+### Compatibility note
+
+**Do not use an older version of `couchbackup` to restore a backup created with a newer version.**
+
+Newer versions of `couchbackup` can restore backups created by older versions within the same major version.
+
 ## Compressed backups

-
+To compress the backup data before storing to disk pipe the contents through `gzip`:

 ```sh
 couchbackup --db animaldb | gzip > animaldb.txt.gz
@@ -159,64 +169,69 @@ couchbackup --db animaldb | openssl aes-128-cbc -pass pass:12345 > encrypted_ani
 openssl aes-128-cbc -d -in encrypted_animal.db -pass pass:12345 | couchrestore --db animaldb2
 ```

-Note that the content is
-backup tool before
+Note that the content is not encrypted in the
+backup tool before piping to the encryption utility.

 ## What's in a backup file?

-A backup file is a text file where each line
+A backup file is a text file where each line is either a JSON object of backup metadata
+or a JSON array of backed up document revision objects, for example:

-```
-
-
+```json
+{"name":"@cloudant/couchbackup","version":"2.9.10","mode":"full"}
+[{"_id": "1","a":1},{"_id": "2","a":2},...]
+[{"_id": "501","a":501},{"_id": "502","a":502}]
 ```

+The number of document revisions in a backup array varies. It typically has
+`buffer_size` elements, but may be more if there are also leaf revisions returned
+from the server or fewer if it is the last batch.
+
 ## What's in a log file?

-A log file
+A log file has a line:

-- for every batch of document ids that
-- for every batch that has
-- to indicate that the changes feed was fully consumed
+- for every batch of document ids that `couchbackup` needs to fetch, for example: `:t batch56 [{"id":"a"},{"id":"b"}]`
+- for every batch that `couchbackup` has fetched and stored, for example: `:d batch56`
+- to indicate that the changes feed was fully consumed, for example: `:changes_complete`

-## What
+## What's shallow mode?

-When you run `couchbackup` with `--mode shallow` a simpler backup
-
-
+When you run `couchbackup` with `--mode shallow` `couchbackup` performs a simpler backup.
+It only backs up the winning revisions and ignores any conflicting revisions.
+This is a faster, but less complete backup.

-
+_Note:_ The `--log`, `--resume`, and `--parallelism` are invalid for `--mode shallow` backups.

 ## Why use CouchBackup?

 The easiest way to backup a CouchDB database is to copy the ".couch" file. This is fine on a single-node instance, but when running multi-node
-Cloudant or using CouchDB 2.0 or greater, the ".couch" file only
+Cloudant or using CouchDB 2.0 or greater, the ".couch" file only has a single shard of data. This utility allows simple backups of CouchDB
 or Cloudant database using the HTTP API.

-This tool can
+This tool can script the backup of your databases. Move the backup and log files to cheap Object Storage so that you have copies of your precious data.

 ## Options reference

 ### Environment variables

-* `COUCH_URL` - the URL of the CouchDB/Cloudant server
-* `COUCH_DATABASE` - the name of the database to act upon
-* `COUCH_PARALLELISM` - the number of HTTP requests to perform in parallel when restoring a backup
-* `COUCH_BUFFER_SIZE` - the number of documents fetched and restored at once
-* `COUCH_REQUEST_TIMEOUT` - the number of milliseconds to wait for a
+* `COUCH_URL` - the URL of the CouchDB/Cloudant server, for example: `http://127.0.0.1:5984`
+* `COUCH_DATABASE` - the name of the database to act upon, for example: `mydb` (default `test`)
+* `COUCH_PARALLELISM` - the number of HTTP requests to perform in parallel when restoring a backup, for example: `10` (Default `5`)
+* `COUCH_BUFFER_SIZE` - the number of documents fetched and restored at once, for example: `100` (default `500`).
+* `COUCH_REQUEST_TIMEOUT` - the number of milliseconds to wait for a response to a HTTP request before retrying the request, for example: `10000` (Default `120000`)
 * `COUCH_LOG` - the file to store logging information during backup
-* `COUCH_RESUME` - if `true`, resumes
+* `COUCH_RESUME` - if `true`, resumes an earlier backup from its last known position (requires a log file)
 * `COUCH_OUTPUT` - the file name to store the backup data (defaults to stdout)
-* `COUCH_MODE` - if `shallow`, only a superficial backup
+* `COUCH_MODE` - if `shallow`, does only a superficial backup ignoring conflicts. Defaults to `full` - a full backup.
 * `COUCH_QUIET` - if `true`, suppresses the individual batch messages to the console during CLI backup and restore
 * `CLOUDANT_IAM_API_KEY` - optional [IAM API key](https://console.bluemix.net/docs/services/Cloudant/guides/iam.html#ibm-cloud-identity-and-access-management)
 to use to access the Cloudant database instead of user information credentials in the URL. The endpoint used to retrieve the token defaults to
 `https://iam.cloud.ibm.com/identity/token`, but can be overridden if necessary using the `CLOUDANT_IAM_TOKEN_URL` environment variable.
-* `DEBUG` - if set to `couchbackup`, all debug messages
+* `DEBUG` - if set to `couchbackup`, all debug messages print on `stderr` during a backup or restore process

-_Note:_
-[using programmatically](#using-programmatically) the `opts` dictionary
-used.
+_Note:_ Environment variables are only used with the CLI. When
+[using programmatically](#using-programmatically) use the `opts` dictionary.

 ### Command-line parameters

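
The backup-file and log-file formats described in the hunk above can be read with a short sketch like the following. It assumes only the line shapes quoted in the README (`:t batchN [...]`, `:d batchN`, `:changes_complete`, and one JSON value per backup line); `classifyBackupLine` and `summarizeLog` are hypothetical names, not part of the `couchbackup` API, and any other line shape is simply ignored.

```javascript
// Hypothetical helpers (not part of couchbackup), based only on the
// line formats quoted in the README text above.

// A backup line is either a JSON object of metadata or a JSON array
// of document revisions.
function classifyBackupLine(line) {
  const parsed = JSON.parse(line);
  return Array.isArray(parsed)
    ? { kind: 'batch', docs: parsed }
    : { kind: 'metadata', metadata: parsed };
}

// Work out which batches a log file lists as still to be fetched:
// every ":t batchN" that has no matching ":d batchN".
function summarizeLog(logText) {
  const pending = new Map(); // batch number -> array of {id} entries
  let changesComplete = false;
  for (const rawLine of logText.split('\n')) {
    const line = rawLine.trim();
    if (line === ':changes_complete') {
      changesComplete = true;
      continue;
    }
    const todo = /^:t batch(\d+) (.*)$/.exec(line);
    if (todo) {
      pending.set(Number(todo[1]), JSON.parse(todo[2]));
      continue;
    }
    const done = /^:d batch(\d+)$/.exec(line);
    if (done) {
      pending.delete(Number(done[1]));
    }
  }
  return { changesComplete, pending };
}

const log = [
  ':t batch0 [{"id":"a"},{"id":"b"}]',
  ':t batch1 [{"id":"c"}]',
  ':d batch0',
  ':changes_complete',
].join('\n');
const summary = summarizeLog(log);
console.log(summary.changesComplete, [...summary.pending.keys()]);
```

Under these assumptions, a log with `changesComplete` set and a non-empty `pending` map corresponds to the case the README says `--resume true` is meant for: changes fully spooled, batches still outstanding.
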
@@ -234,7 +249,7 @@ used.

 ## Using programmatically

-You can use `couchbackup`
+You can use `couchbackup` programmatically. First install
 `couchbackup` into your project with `npm install --save @cloudant/couchbackup`.
 Then you can import the library into your code:

@@ -272,11 +287,11 @@ target locations are not required.
 * `resume`: see `COUCH_RESUME`.
 * `mode`: see `COUCH_MODE`.
 * `iamApiKey`: see `CLOUDANT_IAM_API_KEY`.
-* `iamTokenUrl`:
+* `iamTokenUrl`: optionally used with `iamApiKey` to override the default URL for
 retrieving IAM tokens.

-
-the
+When the backup completes or fails the callback functions gets called with
+the standard `err, data` parameters.

 The `backup` function returns an event emitter. You can subscribe to:

@@ -335,25 +350,20 @@ target locations are not required.
 * `bufferSize`: see `COUCH_BUFFER_SIZE`.
 * `requestTimeout`: see `COUCH_REQUEST_TIMEOUT`.
 * `iamApiKey`: see `CLOUDANT_IAM_API_KEY`.
-* `iamTokenUrl`:
+* `iamTokenUrl`: optionally used with `iamApiKey` to override the default URL for
 retrieving IAM tokens.

-
-the
+When the restore completes or fails the callback functions gets called with
+the standard `err, data` parameters.

 The `restore` function returns an event emitter. You can subscribe to:

 * `restored` - when a batch of documents is restored.
 * `finished` - emitted once when all documents are restored.

-The
-
-
-
-It's possible a list could be corrupt due to failures in the backup process. A
-`BackupFileJsonError` is emitted for each corrupt list found. _These can only be
-ignored if the backup that generated the stream did complete successfully_. This
-ensures that corrupt lists also have a valid counterpart within the stream.
+The `srcStream` for the restore is a [backup file](#whats-in-a-backup-file).
+In the case of an incomplete backup the file could be corrupt and in that
+case the restore emits a `BackupFileJsonError`.

 Restore data from a stream:

@@ -389,19 +399,20 @@ couchbackup.restore(

 ## Error Handling

-The `couchbackup` and `couchrestore` processes are
-
-
+The `couchbackup` and `couchrestore` processes are able to tolerate many errors even over an unreliable network.
+Failed requests retry at least twice after a back-off delay.
+However, certain errors can't tolerate failures:
+- invalid configuration
+- failed validation checks (for example: auth, database existence, `_bulk_get` endpoint avaialbility)

 ### API

-When using the library programmatically
-
-* For non-fatal errors an `error` event will be emitted
+When using the library programmatically in the case of a fatal error
+the callback function gets called with `null, error` arguments.

 ### CLI Exit Codes

-On fatal errors, `couchbackup` and `couchrestore`
+On fatal errors, `couchbackup` and `couchrestore` exit with non-zero exit codes. This section
 details them.

 ### common to both `couchbackup` and `couchrestore`
@@ -410,14 +421,15 @@ details them.
 * `2`: invalid CLI option.
 * `10`: backup source or restore target database does not exist.
 * `11`: unauthorized credentials for the database.
-* `12`:
+* `12`: invalid permissions for the database.
 * `40`: database returned a fatal HTTP error.

 ### `couchbackup`

-* `20`: resume
+* `20`: `--resume` without a log file.
 * `21`: the resume log file does not exist.
 * `22`: incomplete changes in log file.
+* `23`: the log file already exists, but `--resume` was not used.
 * `30`: error spooling changes from the database.
 * `50`: source database does not support `/_bulk_get` endpoint.

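
For scripted backups, the exit codes visible in the hunk above could be turned into readable messages with a lookup like this one. It is a sketch that covers only the codes shown in this diff; codes documented in README lines outside the hunk are not included, and `describeExitCode` is a hypothetical name rather than anything exported by `couchbackup`.

```javascript
// Exit codes copied from the README hunk above; codes documented outside
// the hunk are deliberately not listed here.
const EXIT_CODES = new Map([
  [2, 'invalid CLI option'],
  [10, 'backup source or restore target database does not exist'],
  [11, 'unauthorized credentials for the database'],
  [12, 'invalid permissions for the database'],
  [20, '--resume without a log file'],
  [21, 'the resume log file does not exist'],
  [22, 'incomplete changes in log file'],
  [23, 'the log file already exists, but --resume was not used'],
  [30, 'error spooling changes from the database'],
  [40, 'database returned a fatal HTTP error'],
  [50, 'source database does not support /_bulk_get endpoint'],
]);

// Hypothetical helper: maps a process exit code to a message.
function describeExitCode(code) {
  if (code === 0) {
    return 'success';
  }
  return EXIT_CODES.has(code)
    ? EXIT_CODES.get(code)
    : `exit code ${code} is not listed in this hunk`;
}

console.log(describeExitCode(23));
// prints the log file already exists, but --resume was not used
```

A wrapper script might call `describeExitCode` with the `code` argument of a Node.js child process `exit` event to decide whether to retry, resume, or alert.
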
@@ -427,13 +439,13 @@ details them.

 ## Note on attachments

-TLDR; If you backup a database that
+TLDR; If you backup a database that has attachments `couchbackup` cannot restore it.

-As documented above couchbackup does not support backing up or restoring databases containing documents with attachments.
-
-content
-restore the backup
+As documented above `couchbackup` does not support backing up or restoring databases containing documents with attachments.
+Backing up a database that includes documents with attachments appears to complete successfully. However, the attachment
+content is not downloaded and the backup file contains attachment metadata. So attempts to
+restore the backup result in errors because the attachment metadata references attachments that are not present
 in the restored database.

-
+The recommendation is to store attachments directly in an object store with a link in the JSON document instead of using the
 native attachment API.