@cloudant/couchbackup 2.10.0-SNAPSHOT.199 → 2.10.1-217

package/README.md CHANGED
@@ -14,15 +14,15 @@
14
14
  |_|
15
15
  ```
16
16
 
17
- CouchBackup is a command-line utility that allows a Cloudant or CouchDB database to be backed up to a text file.
17
+ CouchBackup is a command-line utility that backs up a Cloudant or CouchDB database to a text file.
18
18
  It comes with a companion command-line utility that can restore the backed up data.
19
19
 
20
20
  ## Limitations
21
21
 
22
- Couchbackup has some restrictions in the data it's able to backup:
22
+ CouchBackup has some restrictions on the data it can back up:
23
23
 
24
- * **couchbackup does not do CouchDB replication as such, it simply streams through a database's `_changes` feed, and uses `POST /db/_bulk_get` to fetch the documents, storing the documents it finds on disk.**
25
- * **couchbackup does not support backing up or restoring databases containing documents with attachments. It is recommended to store attachments directly in an object store. DO NOT USE THIS TOOL FOR DATABASES CONTAINING ATTACHMENTS.** [Note](#note-on-attachments)
24
+ * **`couchbackup` does not do CouchDB replication as such, it simply streams through a database's `_changes` feed, and uses `POST /db/_bulk_get` to fetch the documents, storing the documents it finds on disk.**
25
+ * **`couchbackup` does not support backing up or restoring databases containing documents with attachments. The recommendation is to store attachments directly in an object store. DO NOT USE THIS TOOL FOR DATABASES CONTAINING ATTACHMENTS.** [Note](#note-on-attachments)
26
26
 
27
27
  ## Installation
28
28
 
@@ -38,11 +38,11 @@ npm install -g @cloudant/couchbackup
38
38
 
39
39
  ### Snapshots
40
40
 
41
- The latest builds of main are published to npm with the `snapshot` tag. Use the `snapshot` tag if you want to experiment with an unreleased fix or new function, but please note that snapshot versions are **unsupported**.
41
+ The latest builds of the `main` branch are available on npm with the `snapshot` tag. Use the `snapshot` tag if you want to experiment with an unreleased fix or new function, but please note that snapshot versions are **not supported**.
42
42
 
43
43
  ## Usage
44
44
 
45
- Either environment variables or command-line options can be used to specify the URL of the CouchDB or Cloudant instance, and the database to work with.
45
+ Use either environment variables or command-line options to specify the URL of the CouchDB or Cloudant instance, and the database to work with.
46
46
 
47
47
  ### The URL
48
48
 
@@ -57,19 +57,19 @@ or
57
57
  export COUCH_URL=https://myusername:mypassword@myhost.cloudant.com
58
58
  ```
59
59
 
60
- Alternatively we can use the `--url` command-line parameter.
60
+ Or use the `--url` command-line parameter.
61
61
 
62
62
  When passing credentials in the user information subcomponent of the URL
63
63
  they must be [percent encoded](https://tools.ietf.org/html/rfc3986#section-3.2.1).
64
64
  Specifically, within either the username or password, the characters `: / ? # [ ] @ %`
65
- _MUST_ be precent-encoded, other characters _MAY_ be percent encoded.
65
+ _MUST_ be percent-encoded, other characters _MAY_ be percent-encoded.
66
66
 
67
67
  For example, for the username `user123` and password `colon:at@321`:
68
68
  ```
69
69
  https://user123:colon%3aat%40321@localhost:5984
70
70
  ```
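As a minimal sketch of the encoding rule above, the built-in `encodeURIComponent` escapes all of the characters listed as _MUST_ be percent-encoded. The helper name `buildCouchUrl` is hypothetical and not part of `couchbackup`. The hex digits come out in upper case (`%3A` rather than `%3a`), which is equivalent under RFC 3986.

```javascript
// Hypothetical helper: builds a URL with percent-encoded user information.
function buildCouchUrl(username, password, hostAndPort) {
  // encodeURIComponent escapes : / ? # [ ] @ % (and other reserved characters).
  const user = encodeURIComponent(username);
  const pass = encodeURIComponent(password);
  return `https://${user}:${pass}@${hostAndPort}`;
}

const url = buildCouchUrl('user123', 'colon:at@321', 'localhost:5984');
console.log(url);
```

The resulting string still needs shell quoting when it is placed in `COUCH_URL` or passed to `--url`.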
71
71
 
72
- Note that additional care must be taken to escape shell reserved characters when
72
+ Note: take extra care to escape shell reserved characters when
73
73
  setting the environment variable or command-line parameter.
74
74
 
75
75
  ### The Database name
@@ -80,7 +80,7 @@ To define the name of the database to backup or restore, set the `COUCH_DATABASE
80
80
  export COUCH_DATABASE=animaldb
81
81
  ```
82
82
 
83
- Alternatively we can use the `--db` command-line parameter
83
+ Or use the `--db` command-line parameter.
84
84
 
85
85
  ## Backup
86
86
 
@@ -98,19 +98,19 @@ couchbackup --db animaldb > animaldb.txt
98
98
 
99
99
  ## Logging & resuming backups
100
100
 
101
- You may also create a log file which records the progress of the backup with the `--log` parameter e.g.
101
+ You may also create a log file which records the progress of the backup with the `--log` parameter, for example:
102
102
 
103
103
  ```sh
104
104
  couchbackup --db animaldb --log animaldb.log > animaldb.txt
105
105
  ```
106
106
 
107
- This log file can be used to resume backups from where you left off with `--resume true`:
107
+ Use this log file to resume backups with `--resume true`:
108
108
 
109
109
  ```sh
110
110
  couchbackup --db animaldb --log animaldb.log --resume true >> animaldb.txt
111
111
  ```
112
112
 
113
- The `--resume true` option works for a backup that has finished spooling changes, but has not yet completed downloading all the necessary batches of documents. It does _not_ provide an incremental backup solution.
113
+ The `--resume true` option works for a backup that has finished spooling changes, but has not yet completed downloading all the necessary batches of documents. It _is not an incremental backup_ solution.
114
114
 
115
115
  You may also specify the name of the output file, rather than directing the backup data to *stdout*:
116
116
 
@@ -118,9 +118,13 @@ You may also specify the name of the output file, rather than directing the back
118
118
  couchbackup --db animaldb --log animaldb.log --resume true --output animaldb.txt
119
119
  ```
120
120
 
121
+ ### Compatibility note
122
+
123
+ When using `--resume` use the same version of `couchbackup` that started the backup.
124
+
121
125
  ## Restore
122
126
 
123
- Now that we have our backup text file, we can restore it to a new, empty, existing database using the `couchrestore`:
127
+ Now restore the backup text file to a new, empty, existing database using `couchrestore`:
124
128
 
125
129
  ```sh
126
130
  cat animaldb.txt | couchrestore
@@ -132,9 +136,15 @@ or specifying the database name on the command-line:
132
136
  cat animaldb.txt | couchrestore --db animaldb2
133
137
  ```
134
138
 
139
+ ### Compatibility note
140
+
141
+ **Do not use an older version of `couchbackup` to restore a backup created with a newer version.**
142
+
143
+ Newer versions of `couchbackup` can restore backups created by older versions within the same major version.
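One way to read the compatibility rule is sketched below. It assumes the version of the tool that wrote the backup is available as a string (for example from the metadata line described under "What's in a backup file?"). `parseVersion` and `canRestore` are hypothetical helpers, and treating a different major version as incompatible is a conservative assumption, not documented behavior.

```javascript
// Hypothetical helpers; couchbackup itself may apply different checks.
function parseVersion(version) {
  // Drop any suffix after "-" (for example "-SNAPSHOT.199") before splitting.
  const [major, minor, patch] = version.split('-')[0].split('.').map(Number);
  return { major, minor, patch };
}

function canRestore(backupVersion, toolVersion) {
  const b = parseVersion(backupVersion);
  const t = parseVersion(toolVersion);
  if (b.major !== t.major) return false; // assumption: require the same major version
  if (t.minor !== b.minor) return t.minor > b.minor; // restoring tool must not be older
  return t.patch >= b.patch;
}

console.log(canRestore('2.9.10', '2.10.1')); // true: newer tool, same major
console.log(canRestore('2.10.1', '2.9.10')); // false: older tool, newer backup
```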
144
+
135
145
  ## Compressed backups
136
146
 
137
- If we want to compress the backup data before storing to disk, we can pipe the contents through `gzip`:
147
+ To compress the backup data before storing to disk, pipe the contents through `gzip`:
138
148
 
139
149
  ```sh
140
150
  couchbackup --db animaldb | gzip > animaldb.txt.gz
@@ -159,64 +169,69 @@ couchbackup --db animaldb | openssl aes-128-cbc -pass pass:12345 > encrypted_ani
159
169
  openssl aes-128-cbc -d -in encrypted_animal.db -pass pass:12345 | couchrestore --db animaldb2
160
170
  ```
161
171
 
162
- Note that the content is unencrypted while it is being processed by the
163
- backup tool before it is piped to the encryption utility.
172
+ Note that the content is not encrypted while the
173
+ backup tool processes it, before piping it to the encryption utility.
164
174
 
165
175
  ## What's in a backup file?
166
176
 
167
- A backup file is a text file where each line contains a JSON encoded array of up to `buffer-size` objects e.g.
177
+ A backup file is a text file where each line is either a JSON object of backup metadata
178
+ or a JSON array of backed up document revision objects, for example:
168
179
 
169
- ```js
170
- [{"a":1},{"a":2}...]
171
- [{"a":501},{"a":502}...]
180
+ ```json
181
+ {"name":"@cloudant/couchbackup","version":"2.9.10","mode":"full"}
182
+ [{"_id": "1","a":1},{"_id": "2","a":2},...]
183
+ [{"_id": "501","a":501},{"_id": "502","a":502}]
172
184
  ```
173
185
 
186
+ The number of document revisions in a backup array varies. It typically has
187
+ `buffer_size` elements, but may be more if there are also leaf revisions returned
188
+ from the server or fewer if it is the last batch.
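The line format described above can be read with a few lines of JavaScript. This is a sketch under the assumption that every non-blank line is valid JSON. `summarizeBackup` is a hypothetical helper, not a `couchbackup` API, and the sample data is shortened from the example above.

```javascript
// Hypothetical helper: classifies each line as metadata (object) or a batch (array).
function summarizeBackup(text) {
  const summary = { metadata: null, batches: 0, docs: 0 };
  for (const line of text.split('\n')) {
    if (line.trim() === '') continue; // skip blank lines
    const parsed = JSON.parse(line); // throws on a corrupt line
    if (Array.isArray(parsed)) {
      summary.batches += 1;
      summary.docs += parsed.length; // batch sizes may vary
    } else {
      summary.metadata = parsed;
    }
  }
  return summary;
}

const sample = [
  '{"name":"@cloudant/couchbackup","version":"2.9.10","mode":"full"}',
  '[{"_id":"1","a":1},{"_id":"2","a":2}]',
  '[{"_id":"501","a":501},{"_id":"502","a":502}]'
].join('\n');

const summary = summarizeBackup(sample);
console.log(summary.metadata.version, summary.batches, summary.docs);
```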
189
+
174
190
  ## What's in a log file?
175
191
 
176
- A log file contains a line:
192
+ A log file has a line:
177
193
 
178
- - for every batch of document ids that need to be fetched e.g. `:t batch56 [{"id":"a"},{"id":"b"}]`
179
- - for every batch that has been fetched and stored e.g. `:d batch56`
180
- - to indicate that the changes feed was fully consumed e.g. `:changes_complete`
194
+ - for every batch of document ids that `couchbackup` needs to fetch, for example: `:t batch56 [{"id":"a"},{"id":"b"}]`
195
+ - for every batch that `couchbackup` has fetched and stored, for example: `:d batch56`
196
+ - to indicate that the changes feed was fully consumed, for example: `:changes_complete`
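To illustrate the three line types, the sketch below works out which batches a log still lists as not fetched. It relies only on the example formats shown above. `pendingBatches` is a hypothetical helper and is not the logic `couchbackup` uses for `--resume`.

```javascript
// Hypothetical helper: reports batches recorded with ":t" but not yet marked ":d".
function pendingBatches(logText) {
  const pending = new Map(); // batch number -> array of {id} objects
  let changesComplete = false;
  for (const rawLine of logText.split('\n')) {
    const line = rawLine.trim();
    if (line === ':changes_complete') {
      changesComplete = true; // the changes feed was fully consumed
      continue;
    }
    const todo = /^:t batch(\d+) (.*)$/.exec(line);
    if (todo) {
      pending.set(Number(todo[1]), JSON.parse(todo[2]));
      continue;
    }
    const done = /^:d batch(\d+)$/.exec(line);
    if (done) pending.delete(Number(done[1]));
  }
  return { changesComplete, pending: [...pending.keys()] };
}

const log = [
  ':t batch0 [{"id":"a"},{"id":"b"}]',
  ':t batch1 [{"id":"c"}]',
  ':d batch0',
  ':changes_complete'
].join('\n');

const state = pendingBatches(log);
console.log(state);
```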
181
197
 
182
- ## What is shallow mode?
198
+ ## What's shallow mode?
183
199
 
184
- When you run `couchbackup` with `--mode shallow` a simpler backup is performed, only backing up the winning revisions
185
- of the database. No revision tokens are saved and any conflicting revisions are ignored. This is a faster, but less
186
- complete backup. Shallow backups cannot be resumed because they do not produce a log file.
200
+ When you run `couchbackup` with `--mode shallow`, it performs a simpler backup.
201
+ It only backs up the winning revisions and ignores any conflicting revisions.
202
+ This is a faster, but less complete backup.
187
203
 
188
- NOTE: Parallellism will not be in effect if `--mode shallow` is defined.
204
+ _Note:_ The `--log`, `--resume`, and `--parallelism` options are invalid for `--mode shallow` backups.
189
205
 
190
206
  ## Why use CouchBackup?
191
207
 
192
208
  The easiest way to backup a CouchDB database is to copy the ".couch" file. This is fine on a single-node instance, but when running multi-node
193
- Cloudant or using CouchDB 2.0 or greater, the ".couch" file only contains a single shard of data. This utility allows simple backups of CouchDB
209
+ Cloudant or using CouchDB 2.0 or greater, the ".couch" file only has a single shard of data. This utility allows simple backups of CouchDB
194
210
  or Cloudant database using the HTTP API.
195
211
 
196
- This tool can be used to script the backup of your databases. Move the backup and log files to cheap Object Storage so that you have multiple copies of your precious data.
212
+ This tool can script the backup of your databases. Move the backup and log files to cheap Object Storage so that you have copies of your precious data.
197
213
 
198
214
  ## Options reference
199
215
 
200
216
  ### Environment variables
201
217
 
202
- * `COUCH_URL` - the URL of the CouchDB/Cloudant server e.g. `http://127.0.0.1:5984`
203
- * `COUCH_DATABASE` - the name of the database to act upon e.g. `mydb` (default `test`)
204
- * `COUCH_PARALLELISM` - the number of HTTP requests to perform in parallel when restoring a backup e.g. `10` (Default `5`)
205
- * `COUCH_BUFFER_SIZE` - the number of documents fetched and restored at once e.g. `100` (default `500`). When using CouchBackup with [Cloudant on Transaction Engine](https://www.ibm.com/cloud/blog/announcements/ibm-cloudant-on-transaction-engine) `COUCH_BUFFER_SIZE` must be less than `2000` to avoid bad request errors.
206
- * `COUCH_REQUEST_TIMEOUT` - the number of milliseconds to wait for a respose to a HTTP request before retrying the request e.g. `10000` (Default `120000`)
218
+ * `COUCH_URL` - the URL of the CouchDB/Cloudant server, for example: `http://127.0.0.1:5984`
219
+ * `COUCH_DATABASE` - the name of the database to act upon, for example: `mydb` (default `test`)
220
+ * `COUCH_PARALLELISM` - the number of HTTP requests to perform in parallel when restoring a backup, for example: `10` (Default `5`)
221
+ * `COUCH_BUFFER_SIZE` - the number of documents fetched and restored at once, for example: `100` (default `500`).
222
+ * `COUCH_REQUEST_TIMEOUT` - the number of milliseconds to wait for a response to a HTTP request before retrying the request, for example: `10000` (Default `120000`)
207
223
  * `COUCH_LOG` - the file to store logging information during backup
208
- * `COUCH_RESUME` - if `true`, resumes a previous backup from its last known position
224
+ * `COUCH_RESUME` - if `true`, resumes an earlier backup from its last known position (requires a log file)
209
225
  * `COUCH_OUTPUT` - the file name to store the backup data (defaults to stdout)
210
- * `COUCH_MODE` - if `shallow`, only a superficial backup is done, ignoring conflicts and revision tokens. Defaults to `full` - a full backup.
226
+ * `COUCH_MODE` - if `shallow`, does only a superficial backup ignoring conflicts. Defaults to `full` - a full backup.
211
227
  * `COUCH_QUIET` - if `true`, suppresses the individual batch messages to the console during CLI backup and restore
212
228
  * `CLOUDANT_IAM_API_KEY` - optional [IAM API key](https://console.bluemix.net/docs/services/Cloudant/guides/iam.html#ibm-cloud-identity-and-access-management)
213
229
  to use to access the Cloudant database instead of user information credentials in the URL. The endpoint used to retrieve the token defaults to
214
230
  `https://iam.cloud.ibm.com/identity/token`, but can be overridden if necessary using the `CLOUDANT_IAM_TOKEN_URL` environment variable.
215
- * `DEBUG` - if set to `couchbackup`, all debug messages will be sent to `stderr` during a backup or restore process
231
+ * `DEBUG` - if set to `couchbackup`, all debug messages print on `stderr` during a backup or restore process
216
232
 
217
- _Note:_ These environment variables can only be used with the CLI. When
218
- [using programmatically](#using-programmatically) the `opts` dictionary must be
219
- used.
233
+ _Note:_ Environment variables are only used with the CLI. When
234
+ [using programmatically](#using-programmatically) use the `opts` dictionary.
220
235
 
221
236
  ### Command-line parameters
222
237
 
@@ -234,7 +249,7 @@ used.
234
249
 
235
250
  ## Using programmatically
236
251
 
237
- You can use `couchbackup` programatically. First install
252
+ You can use `couchbackup` programmatically. First install
238
253
  `couchbackup` into your project with `npm install --save @cloudant/couchbackup`.
239
254
  Then you can import the library into your code:
240
255
 
@@ -272,11 +287,11 @@ target locations are not required.
272
287
  * `resume`: see `COUCH_RESUME`.
273
288
  * `mode`: see `COUCH_MODE`.
274
289
  * `iamApiKey`: see `CLOUDANT_IAM_API_KEY`.
275
- * `iamTokenUrl`: may be used with `iamApiKey` to override the default URL for
290
+ * `iamTokenUrl`: optionally used with `iamApiKey` to override the default URL for
276
291
  retrieving IAM tokens.
277
292
 
278
- The callback has the standard `err, data` parameters and is called when
279
- the backup completes or fails.
293
+ When the backup completes or fails, the callback function gets called with
294
+ the standard `err, data` parameters.
280
295
 
281
296
  The `backup` function returns an event emitter. You can subscribe to:
282
297
 
@@ -335,25 +350,20 @@ target locations are not required.
335
350
  * `bufferSize`: see `COUCH_BUFFER_SIZE`.
336
351
  * `requestTimeout`: see `COUCH_REQUEST_TIMEOUT`.
337
352
  * `iamApiKey`: see `CLOUDANT_IAM_API_KEY`.
338
- * `iamTokenUrl`: may be used with `iamApiKey` to override the default URL for
353
+ * `iamTokenUrl`: optionally used with `iamApiKey` to override the default URL for
339
354
  retrieving IAM tokens.
340
355
 
341
- The callback has the standard `err, data` parameters and is called when
342
- the restore completes or fails.
356
+ When the restore completes or fails, the callback function gets called with
357
+ the standard `err, data` parameters.
343
358
 
344
359
  The `restore` function returns an event emitter. You can subscribe to:
345
360
 
346
361
  * `restored` - when a batch of documents is restored.
347
362
  * `finished` - emitted once when all documents are restored.
348
363
 
349
- The backup file (or `srcStream`) contains lists comprising of document
350
- revisions, where each list is separated by a newline. The list length is
351
- dictated by the `bufferSize` parameter used during the backup.
352
-
353
- It's possible a list could be corrupt due to failures in the backup process. A
354
- `BackupFileJsonError` is emitted for each corrupt list found. _These can only be
355
- ignored if the backup that generated the stream did complete successfully_. This
356
- ensures that corrupt lists also have a valid counterpart within the stream.
364
+ The `srcStream` for the restore is a [backup file](#whats-in-a-backup-file).
365
+ In the case of an incomplete backup the file could be corrupt and in that
366
+ case the restore emits a `BackupFileJsonError`.
357
367
 
358
368
  Restore data from a stream:
359
369
 
@@ -389,19 +399,20 @@ couchbackup.restore(
389
399
 
390
400
  ## Error Handling
391
401
 
392
- The `couchbackup` and `couchrestore` processes are designed to be relatively robust over an unreliable network. Work is batched and any failed requests are retried indefinitely. However, certain aspects of the execution will not tolerate failure:
393
- - Spooling changes from the database changes feed. A failure in the changes request during the backup process will result in process termination.
394
- - Validating the existence of a target database during the database restore process.
402
+ The `couchbackup` and `couchrestore` processes are able to tolerate many errors even over an unreliable network.
403
+ Failed requests retry at least twice after a back-off delay.
404
+ However, the processes can't tolerate certain errors:
405
+ - invalid configuration
406
+ - failed validation checks (for example: auth, database existence, `_bulk_get` endpoint availability)
395
407
 
396
408
  ### API
397
409
 
398
- When using the library programmatically an `Error` will be passed in one of two ways:
399
- * For fatal errors the callback will be called with `null, error` arguments
400
- * For non-fatal errors an `error` event will be emitted
410
+ When using the library programmatically in the case of a fatal error
411
+ the callback function gets called with `null, error` arguments.
401
412
 
402
413
  ### CLI Exit Codes
403
414
 
404
- On fatal errors, `couchbackup` and `couchrestore` will exit with non-zero exit codes. This section
415
+ On fatal errors, `couchbackup` and `couchrestore` exit with non-zero exit codes. This section
405
416
  details them.
406
417
 
407
418
  ### common to both `couchbackup` and `couchrestore`
@@ -410,14 +421,15 @@ details them.
410
421
  * `2`: invalid CLI option.
411
422
  * `10`: backup source or restore target database does not exist.
412
423
  * `11`: unauthorized credentials for the database.
413
- * `12`: incorrect permissions for the database.
424
+ * `12`: invalid permissions for the database.
414
425
  * `40`: database returned a fatal HTTP error.
415
426
 
416
427
  ### `couchbackup`
417
428
 
418
- * `20`: resume was specified without a log file.
429
+ * `20`: `--resume` without a log file.
419
430
  * `21`: the resume log file does not exist.
420
431
  * `22`: incomplete changes in log file.
432
+ * `23`: the log file already exists, but `--resume` was not used.
421
433
  * `30`: error spooling changes from the database.
422
434
  * `50`: source database does not support `/_bulk_get` endpoint.
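A script that wraps the CLI could translate these codes into messages. The sketch below contains only the codes visible in the lists above, with wording taken from the newer version of each entry. `describeExit` and `EXIT_CODES` are hypothetical names, and any code not listed here is reported as unknown rather than guessed.

```javascript
// Hypothetical lookup table built from the exit codes listed above.
const EXIT_CODES = {
  2: 'invalid CLI option',
  10: 'backup source or restore target database does not exist',
  11: 'unauthorized credentials for the database',
  12: 'invalid permissions for the database',
  20: '--resume without a log file',
  21: 'the resume log file does not exist',
  22: 'incomplete changes in log file',
  23: 'the log file already exists, but --resume was not used',
  30: 'error spooling changes from the database',
  40: 'database returned a fatal HTTP error',
  50: 'source database does not support /_bulk_get endpoint'
};

function describeExit(code) {
  if (code === 0) return 'no fatal error';
  return EXIT_CODES[code] ?? `unrecognized exit code ${code}`;
}

console.log(describeExit(23));
```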
423
435
 
@@ -427,13 +439,13 @@ details them.
427
439
 
428
440
  ## Note on attachments
429
441
 
430
- TLDR; If you backup a database that contains attachments you will not be able to restore it.
442
+ TL;DR: If you back up a database that has attachments, `couchbackup` cannot restore it.
431
443
 
432
- As documented above couchbackup does not support backing up or restoring databases containing documents with attachments.
433
- Attempting to backup a database that includes documents with attachments will appear to succeed. However, the attachment
434
- content will not have been downloaded and the backup file will contain attachment metadata. Consequently any attempt to
435
- restore the backup will result in errors because the attachment metadata will reference attachments that are not present
444
+ As documented above `couchbackup` does not support backing up or restoring databases containing documents with attachments.
445
+ Backing up a database that includes documents with attachments appears to complete successfully. However, the attachment
446
+ content is not downloaded and the backup file contains attachment metadata. So attempts to
447
+ restore the backup result in errors because the attachment metadata references attachments that are not present
436
448
  in the restored database.
437
449
 
438
- It is recommended to store attachments directly in an object store with a link in the JSON document instead of using the
450
+ The recommendation is to store attachments directly in an object store with a link in the JSON document instead of using the
439
451
  native attachment API.