@cloudant/couchbackup 2.10.3-SNAPSHOT-239 → 2.10.3-SNAPSHOT-241

package/README.md CHANGED
@@ -22,7 +22,7 @@ It comes with a companion command-line utility that can restore the backed up da
22
22
  CouchBackup has some restrictions in the data it's able to backup:
23
23
 
24
24
  * **`couchbackup` does not do CouchDB replication as such, it simply streams through a database's `_changes` feed, and uses `POST /db/_bulk_get` to fetch the documents, storing the documents it finds on disk.**
25
- * **`couchbackup` does not support backing up or restoring databases containing documents with attachments. The recommendation is to store attachments directly in an object store. DO NOT USE THIS TOOL FOR DATABASES CONTAINING ATTACHMENTS.** [Note](#note-on-attachments)
25
+ * **`couchbackup` does not support backing up or restoring databases containing documents with attachments. The recommendation is to store attachments directly in an object store. The "attachments" option is provided as-is and is not supported. This option is for Apache CouchDB only and is experimental. DO NOT USE THIS OPTION WITH IBM Cloudant backups.** [Note](#note-on-attachments)
26
26
 
27
27
  ## Installation
28
28
 
@@ -228,6 +228,7 @@ This tool can script the backup of your databases. Move the backup and log files
228
228
  * `CLOUDANT_IAM_API_KEY` - optional [IAM API key](https://console.bluemix.net/docs/services/Cloudant/guides/iam.html#ibm-cloud-identity-and-access-management)
229
229
  to use to access the Cloudant database instead of user information credentials in the URL. The endpoint used to retrieve the token defaults to
230
230
  `https://iam.cloud.ibm.com/identity/token`, but can be overridden if necessary using the `CLOUDANT_IAM_TOKEN_URL` environment variable.
231
+ * `COUCH_ATTACHMENTS` - _EXPERIMENTAL & UNSUPPORTED_ (see [Note](#note-on-attachments)) if `true` will include attachments as part of the backup or restore process.
231
232
  * `DEBUG` - if set to `couchbackup`, all debug messages print on `stderr` during a backup or restore process
232
233
 
233
234
  _Note:_ Environment variables are only used with the CLI. When
@@ -246,6 +247,7 @@ _Note:_ Environment variables are only used with the CLI. When
246
247
  * `--mode` - same as `COUCH_MODE`
247
248
  * `--iam-api-key` - same as `CLOUDANT_IAM_API_KEY`
248
249
  * `--quiet` - same as `COUCH_QUIET`
250
+ * `--attachments` - _EXPERIMENTAL & UNSUPPORTED_ (see [Note](#note-on-attachments)) same as `COUCH_ATTACHMENTS`
249
251
 
250
252
  ## Using programmatically
251
253
 
@@ -289,6 +291,7 @@ target locations are not required.
289
291
  * `iamApiKey`: see `CLOUDANT_IAM_API_KEY`.
290
292
  * `iamTokenUrl`: optionally used with `iamApiKey` to override the default URL for
291
293
  retrieving IAM tokens.
294
+ * `attachments`: _EXPERIMENTAL & UNSUPPORTED_ (see [Note](#note-on-attachments)), see `COUCH_ATTACHMENTS`.
292
295
 
293
296
  When the backup completes or fails the callback function gets called with
294
297
  the standard `err, data` parameters.
@@ -352,6 +355,7 @@ target locations are not required.
352
355
  * `iamApiKey`: see `CLOUDANT_IAM_API_KEY`.
353
356
  * `iamTokenUrl`: optionally used with `iamApiKey` to override the default URL for
354
357
  retrieving IAM tokens.
358
+ * `attachments`: _EXPERIMENTAL & UNSUPPORTED_ (see [Note](#note-on-attachments)), see `COUCH_ATTACHMENTS`.
355
359
 
356
360
  When the restore completes or fails the callback function gets called with
357
361
  the standard `err, data` parameters.
@@ -436,16 +440,25 @@ details them.
436
440
  ### `couchrestore`
437
441
 
438
442
  * `13`: restore target database is not new and empty.
443
+ * `60`: `attachments` option used for backup, but wasn't used for restore.
444
+ * `61`: `attachments` option used for restore, but wasn't used for backup.
439
445
 
440
446
  ## Note on attachments
441
447
 
442
- TLDR; If you backup a database that has attachments `couchbackup` cannot restore it.
448
+ TLDR; If you back up a database that has attachments without using the `attachments` option, `couchbackup` can't restore it.
443
449
 
444
450
  As documented above `couchbackup` does not support backing up or restoring databases containing documents with attachments.
451
+
452
+ The recommendation is to store attachments directly in an object store with a link in the JSON document instead of using the
453
+ native attachment API.
454
+
455
+ ### With experimental `attachments` option
456
+
457
+ The `attachments` option is provided as-is and is not supported. This option is for Apache CouchDB only and is experimental. Do not use this option with IBM Cloudant backups.
458
+
459
+ ### Without experimental `attachments` option
460
+
445
461
  Backing up a database that includes documents with attachments appears to complete successfully. However, the attachment
446
462
  content is not downloaded and the backup file contains attachment metadata. So attempts to
447
463
  restore the backup result in errors because the attachment metadata references attachments that are not present
448
464
  in the restored database.
449
-
450
- The recommendation is to store attachments directly in an object store with a link in the JSON document instead of using the
451
- native attachment API.
package/app.js CHANGED
@@ -97,7 +97,9 @@ async function validateOptions(opts) {
97
97
  { key: 'parallelism', type: 'number' },
98
98
  { key: 'requestTimeout', type: 'number' },
99
99
  { key: 'mode', type: 'enum', values: ['full', 'shallow'] },
100
- { key: 'resume', type: 'boolean' }
100
+ { key: 'resume', type: 'boolean' },
101
+ { key: 'quiet', type: 'boolean' },
102
+ { key: 'attachments', type: 'boolean' }
101
103
  ];
102
104
 
103
105
  for (const rule of rules) {
@@ -189,6 +191,14 @@ async function validateLogOnResume(opts) {
189
191
  return true;
190
192
  }
191
193
 
194
+ async function attachmentWarnings(opts) {
195
+ if (opts && opts.attachments) {
196
+ console.warn('WARNING: The "attachments" option is provided as-is and is not supported. ' +
197
+ 'This option is for Apache CouchDB only and is experimental. ' +
198
+ 'Do not use this option with IBM Cloudant.');
199
+ }
200
+ }
201
+
192
202
  /**
193
203
  * Validate arguments.
194
204
  *
@@ -201,7 +211,8 @@ async function validateArgs(url, opts, isBackup = true) {
201
211
  const isIAM = opts && typeof opts.iamApiKey === 'string';
202
212
  const validations = [
203
213
  validateURL(url, isIAM),
204
- validateOptions(opts)
214
+ validateOptions(opts),
215
+ attachmentWarnings(opts)
205
216
  ];
206
217
  if (isBackup) {
207
218
  validations.push(
@@ -326,7 +337,7 @@ module.exports = {
326
337
  } else {
327
338
  // Write a file header including the name, version and mode
328
339
  debug('Will write backup file header.');
329
- metadataToWrite = `${JSON.stringify({ name: pkg.name, version: pkg.version, mode: opts.mode })}\n`;
340
+ metadataToWrite = `${JSON.stringify({ name: pkg.name, version: pkg.version, mode: opts.mode, attachments: opts.attachments })}\n`;
330
341
  }
331
342
  return new Promise((resolve, reject) => {
332
343
  targetStream.write(metadataToWrite, 'utf-8', (err) => {
@@ -37,7 +37,8 @@ try {
37
37
  requestTimeout: program.requestTimeout,
38
38
  resume: program.resume,
39
39
  iamApiKey: program.iamApiKey,
40
- iamTokenUrl: program.iamTokenUrl
40
+ iamTokenUrl: program.iamTokenUrl,
41
+ attachments: program.attachments
41
42
  };
42
43
 
43
44
  // log configuration to console
@@ -32,7 +32,8 @@ try {
32
32
  parallelism: program.parallelism,
33
33
  requestTimeout: program.requestTimeout,
34
34
  iamApiKey: program.iamApiKey,
35
- iamTokenUrl: program.iamTokenUrl
35
+ iamTokenUrl: program.iamTokenUrl,
36
+ attachments: program.attachments
36
37
  };
37
38
 
38
39
  // log configuration to console
@@ -1,4 +1,4 @@
1
- // Copyright © 2023 IBM Corp. All rights reserved.
1
+ // Copyright © 2023, 2024 IBM Corp. All rights reserved.
2
2
  //
3
3
  // Licensed under the Apache License, Version 2.0 (the "License");
4
4
  // you may not use this file except in compliance with the License.
@@ -28,6 +28,9 @@ module.exports = async function * (dbClient, options = {}) {
28
28
  let lastPage = false;
29
29
  let startKey = null;
30
30
  const opts = { db: dbClient.dbName, limit: options.bufferSize, includeDocs: true };
31
+ if (options.attachments === true) {
32
+ opts.attachments = true;
33
+ }
31
34
  do {
32
35
  if (startKey) opts.startKey = startKey;
33
36
  yield dbClient.service.postAllDocs(opts).then(response => {
@@ -0,0 +1,63 @@
1
+ // Copyright © 2024 IBM Corp. All rights reserved.
2
+ //
3
+ // Licensed under the Apache License, Version 2.0 (the "License");
4
+ // you may not use this file except in compliance with the License.
5
+ // You may obtain a copy of the License at
6
+ //
7
+ // http://www.apache.org/licenses/LICENSE-2.0
8
+ //
9
+ // Unless required by applicable law or agreed to in writing, software
10
+ // distributed under the License is distributed on an "AS IS" BASIS,
11
+ // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
12
+ // See the License for the specific language governing permissions and
13
+ // limitations under the License.
14
+ 'use strict';
15
+
16
+ const debug = require('debug');
17
+ const mappingDebug = debug('couchbackup:mappings');
18
+
19
+ /**
20
+ * The cloudant-node-sdk helpfully automatically converts the base64 encoded
21
+ * inline attachments into Buffer so the binary data can be used by consuming
22
+ * applications without the need to decode b64.
23
+ * However, in the case of couchbackup we actually want the b64 data so that
24
+ * we can write it in the inline attachment format to the backup file.
25
+ * This class provides the mappings between Buffer and Base64 binary data.
26
+ */
27
+ class Attachments {
28
+ encode(backupBatch) {
29
+ backupBatch.docs.map(doc => {
30
+ if (doc._attachments) {
31
+ Object.entries(doc._attachments).forEach(([k, attachment]) => {
32
+ mappingDebug(`Preparing attachment ${k} for backup.`);
33
+ // Attachment data is a Buffer
34
+ // Base64 encode the attachment data for the backup file
35
+ attachment.data = attachment.data.toString('base64');
36
+ return [k, attachment];
37
+ });
38
+ }
39
+ return doc;
40
+ });
41
+ return backupBatch;
42
+ }
43
+
44
+ decode(restoreBatch) {
45
+ restoreBatch.docs.map(doc => {
46
+ if (doc._attachments) {
47
+ Object.entries(doc._attachments).forEach(([k, attachment]) => {
48
+ mappingDebug(`Preparing attachment ${k} for restore.`);
49
+ // Attachment data is a Base64 string
50
+ // Base64 decode the attachment data into a Buffer
51
+ attachment.data = Buffer.from(attachment.data, 'base64');
52
+ return [k, attachment];
53
+ });
54
+ }
55
+ return doc;
56
+ });
57
+ return restoreBatch;
58
+ }
59
+ }
60
+
61
+ module.exports = {
62
+ Attachments
63
+ };
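The Buffer to Base64 mapping described in the file above can be illustrated with a minimal round-trip sketch. The function names `encodeAttachments` and `decodeAttachments` and the sample document are hypothetical and written per document rather than per batch; they are not the package's `Attachments` class, only an illustration of the same idea.

```javascript
'use strict';

// Hypothetical helpers illustrating the Buffer <-> Base64 mapping.
// They mutate the document in place, as the class in the diff does.
function encodeAttachments(doc) {
  for (const attachment of Object.values(doc._attachments ?? {})) {
    // Buffer -> Base64 string, suitable for a JSON backup line
    attachment.data = attachment.data.toString('base64');
  }
  return doc;
}

function decodeAttachments(doc) {
  for (const attachment of Object.values(doc._attachments ?? {})) {
    // Base64 string -> Buffer, the form assumed to be needed on restore
    attachment.data = Buffer.from(attachment.data, 'base64');
  }
  return doc;
}

// Sample document (invented for this example)
const original = Buffer.from('hello attachment', 'utf8');
const doc = {
  _id: 'example',
  _attachments: { 'note.txt': { content_type: 'text/plain', data: original } }
};

// Serialize as a backup line might be, then parse and decode again
const line = JSON.stringify(encodeAttachments(doc));
const restored = decodeAttachments(JSON.parse(line));
console.log(restored._attachments['note.txt'].data.equals(original)); // true
```

Encoding before serialization matters because `JSON.stringify` on a raw `Buffer` produces a `{ type: 'Buffer', data: [...] }` object rather than the inline Base64 string that the file comment says the backup file should contain.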
@@ -15,6 +15,7 @@
15
15
 
16
16
  const { createWriteStream } = require('node:fs');
17
17
  const { pipeline } = require('node:stream/promises');
18
+ const { Attachments } = require('./attachmentMappings.js');
18
19
  const { Backup } = require('./backupMappings.js');
19
20
  const { BackupError } = require('./error.js');
20
21
  const logFileSummary = require('./logfilesummary.js');
@@ -87,13 +88,14 @@ module.exports = function(dbClient, options, targetStream, ee) {
87
88
  })
88
89
  // Create a pipeline of the source streams and the backup mappings
89
90
  .then((srcStreams) => {
90
- const backup = new Backup(dbClient);
91
+ const backup = new Backup(dbClient, options);
91
92
  const postWrite = (backupBatch) => {
92
93
  total += backupBatch.docs.length;
93
94
  const totalRunningTimeSec = (new Date().getTime() - start) / 1000;
94
95
  ee.emit('written', { total, time: totalRunningTimeSec, batch: backupBatch.batch });
95
96
  };
96
97
 
98
+ const mappingStreams = [];
97
99
  const destinationStreams = [];
98
100
  if (options.mode === 'shallow') {
99
101
  // shallow mode writes only to backup file
@@ -108,8 +110,10 @@ module.exports = function(dbClient, options, targetStream, ee) {
108
110
  );
109
111
  } else {
110
112
  // full mode needs to fetch spooled changes and writes a backup file then finally a log file
113
+ mappingStreams.push(...[
114
+ new MappingStream(backup.pendingToFetched, options.parallelism) // fetch the batches at the configured concurrency
115
+ ]);
111
116
  destinationStreams.push(...[
112
- new MappingStream(backup.pendingToFetched, options.parallelism), // fetch the batches at the configured concurrency
113
117
  new WritableWithPassThrough(
114
118
  'backup', // name for logging
115
119
  targetStream, // backup file
@@ -126,8 +130,15 @@ module.exports = function(dbClient, options, targetStream, ee) {
126
130
  ]);
127
131
  }
128
132
 
133
+ if (options.attachments) {
134
+ mappingStreams.push(
135
+ new MappingStream(new Attachments().encode, options.parallelism)
136
+ );
137
+ }
138
+
129
139
  return pipeline(
130
140
  ...srcStreams, // the source streams from the previous block (all docs async generator for shallow or for full either spool changes or resumed log)
141
+ ...mappingStreams, // map from source to destination content
131
142
  ...destinationStreams // the appropriate destination streams for the mode
132
143
  );
133
144
  })
@@ -1,4 +1,4 @@
1
- // Copyright © 2017, 2023 IBM Corp. All rights reserved.
1
+ // Copyright © 2017, 2024 IBM Corp. All rights reserved.
2
2
  //
3
3
  // Licensed under the Apache License, Version 2.0 (the "License");
4
4
  // you may not use this file except in compliance with the License.
@@ -171,8 +171,9 @@ class LogMapper {
171
171
  }
172
172
 
173
173
  class Backup {
174
- constructor(dbClient) {
174
+ constructor(dbClient, options) {
175
175
  this.dbClient = dbClient;
176
+ this.options = options;
176
177
  }
177
178
 
178
179
  /**
@@ -207,11 +208,15 @@ class Backup {
207
208
  pendingToFetched = async(backupBatch) => {
208
209
  mappingDebug(`Fetching batch ${backupBatch.batch}.`);
209
210
  try {
210
- const response = await this.dbClient.service.postBulkGet({
211
+ const bulkGetOpts = {
211
212
  db: this.dbClient.dbName,
212
213
  revs: true,
213
214
  docs: backupBatch.docs
214
- });
215
+ };
216
+ if (this.options.attachments) {
217
+ bulkGetOpts.attachments = true;
218
+ }
219
+ const response = await this.dbClient.service.postBulkGet(bulkGetOpts);
215
220
 
216
221
  mappingDebug(`Good server response for batch ${backupBatch.batch}.`);
217
222
  // create an output array with the docs returned
@@ -1,4 +1,4 @@
1
- // Copyright © 2017, 2023 IBM Corp. All rights reserved.
1
+ // Copyright © 2017, 2024 IBM Corp. All rights reserved.
2
2
  //
3
3
  // Licensed under the Apache License, Version 2.0 (the "License");
4
4
  // you may not use this file except in compliance with the License.
@@ -22,6 +22,7 @@ const { join, normalize } = require('node:path');
22
22
  */
23
23
  function apiDefaults() {
24
24
  return {
25
+ attachments: false,
25
26
  parallelism: 5,
26
27
  bufferSize: 500,
27
28
  requestTimeout: 120000,
@@ -110,6 +111,11 @@ function applyEnvironmentVariables(opts) {
110
111
  if (typeof process.env.CLOUDANT_IAM_TOKEN_URL !== 'undefined') {
111
112
  opts.iamTokenUrl = process.env.CLOUDANT_IAM_TOKEN_URL;
112
113
  }
114
+
115
+ // if we are instructed to include attachments
116
+ if (typeof process.env.COUCH_ATTACHMENTS !== 'undefined' && process.env.COUCH_ATTACHMENTS === 'true') {
117
+ opts.attachments = true;
118
+ }
113
119
  }
114
120
 
115
121
  module.exports = {
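As a rough sketch of the environment variable handling added in the hunk above, only the exact string `true` switches the option on. The helper name `attachmentsFromEnv` is hypothetical, and the environment is passed in as a plain object so the example does not depend on `process.env`.

```javascript
'use strict';

// Hypothetical helper: returns options with attachments enabled only when
// COUCH_ATTACHMENTS is exactly the string 'true'.
function attachmentsFromEnv(env, opts = { attachments: false }) {
  if (env.COUCH_ATTACHMENTS === 'true') {
    return { ...opts, attachments: true };
  }
  return opts;
}

console.log(attachmentsFromEnv({ COUCH_ATTACHMENTS: 'true' }).attachments); // true
console.log(attachmentsFromEnv({ COUCH_ATTACHMENTS: '1' }).attachments);    // false
console.log(attachmentsFromEnv({}).attachments);                            // false
```

A strict string comparison already covers the unset case, so a separate `typeof ... !== 'undefined'` check is not needed in this sketch.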
package/includes/error.js CHANGED
@@ -1,4 +1,4 @@
1
- // Copyright © 2017, 2023 IBM Corp. All rights reserved.
1
+ // Copyright © 2017, 2024 IBM Corp. All rights reserved.
2
2
  //
3
3
  // Licensed under the Apache License, Version 2.0 (the "License");
4
4
  // you may not use this file except in compliance with the License.
@@ -27,7 +27,9 @@ const codes = {
27
27
  LogFileExists: 23,
28
28
  SpoolChangesError: 30,
29
29
  HTTPFatalError: 40,
30
- BulkGetError: 50
30
+ BulkGetError: 50,
31
+ AttachmentsNotEnabledError: 60,
32
+ AttachmentsMetadataAbsent: 61
31
33
  };
32
34
 
33
35
  class BackupError extends Error {
@@ -33,6 +33,8 @@ function parseBackupArgs() {
33
33
  .version(pkg.version)
34
34
  .description('Backup a CouchDB/Cloudant database to a backup text file.')
35
35
  .usage('[options...]')
36
+ .option('-a, --attachments',
37
+ cliutils.getUsage('*EXPERIMENTAL/UNSUPPORTED*: enable backup of attachments', defaults.attachments))
36
38
  .option('-b, --buffer-size <n>',
37
39
  cliutils.getUsage('number of documents fetched at once', defaults.bufferSize),
38
40
  Number)
@@ -97,6 +99,8 @@ function parseRestoreArgs() {
97
99
  .version(pkg.version)
98
100
  .description('Restore a CouchDB/Cloudant database from a backup text file.')
99
101
  .usage('[options...]')
102
+ .option('-a, --attachments',
103
+ cliutils.getUsage('*EXPERIMENTAL/UNSUPPORTED*: enable restore of attachments', defaults.attachments))
100
104
  .option('-b, --buffer-size <n>',
101
105
  cliutils.getUsage('number of documents restored at once', defaults.bufferSize),
102
106
  Number)
@@ -14,8 +14,9 @@
14
14
  'use strict';
15
15
 
16
16
  const debug = require('debug')('couchbackup:restore');
17
- const { Liner } = require('../includes/liner.js');
18
- const { Restore } = require('../includes/restoreMappings.js');
17
+ const { Attachments } = require('./attachmentMappings.js');
18
+ const { Liner } = require('./liner.js');
19
+ const { Restore } = require('./restoreMappings.js');
19
20
  const { BatchingStream, MappingStream } = require('./transforms.js');
20
21
  const { Writable } = require('node:stream');
21
22
  const { pipeline } = require('node:stream/promises');
@@ -30,7 +31,7 @@ const { pipeline } = require('node:stream/promises');
30
31
  * @returns a promise that resolves when the restore is complete or rejects if it errors
31
32
  */
32
33
  module.exports = function(dbClient, options, readstream, ee) {
33
- const restore = new Restore(dbClient);
34
+ const restore = new Restore(dbClient, options);
34
35
  const start = new Date().getTime(); // restore start time
35
36
  let total = 0; // the total restored
36
37
 
@@ -48,14 +49,29 @@ module.exports = function(dbClient, options, readstream, ee) {
48
49
  }
49
50
  });
50
51
 
51
- return pipeline(
52
+ const batchPreparationStreams = [
52
53
  readstream, // the backup file
53
54
  new Liner(), // line by line
54
55
  new MappingStream(restore.backupLineToDocsArray), // convert line to a docs array
55
56
  new BatchingStream(options.bufferSize, true), // make new arrays of the correct buffer size
56
- new MappingStream(restore.docsToRestoreBatch), // make a restore batch
57
+ new MappingStream(restore.docsToRestoreBatch) // make a restore batch
58
+ ];
59
+ const mappingStreams = [];
60
+ const restoreStreams = [
57
61
  new MappingStream(restore.pendingToRestored, options.parallelism), // do the restore at the desired level of concurrency
58
62
  output // emit restored events
63
+ ];
64
+
65
+ if (options.attachments) {
66
+ mappingStreams.push(
67
+ new MappingStream(new Attachments().decode, options.parallelism)
68
+ );
69
+ }
70
+
71
+ return pipeline(
72
+ ...batchPreparationStreams,
73
+ ...mappingStreams,
74
+ ...restoreStreams
59
75
  ).then(() => {
60
76
  return { total };
61
77
  });
@@ -1,4 +1,4 @@
1
- // Copyright © 2017, 2023 IBM Corp. All rights reserved.
1
+ // Copyright © 2017, 2024 IBM Corp. All rights reserved.
2
2
  //
3
3
  // Licensed under the Apache License, Version 2.0 (the "License");
4
4
  // you may not use this file except in compliance with the License.
@@ -28,8 +28,9 @@ class Restore {
28
28
  suppressAllBrokenJSONErrors = true;
29
29
  backupMode;
30
30
 
31
- constructor(dbClient) {
31
+ constructor(dbClient, options) {
32
32
  this.dbClient = dbClient;
33
+ this.options = options;
33
34
  this.batchCounter = 0;
34
35
  }
35
36
 
@@ -65,7 +66,7 @@ class Restore {
65
66
  return lineAsJson;
66
67
  } else if (backupLine.lineNumber === 1 && lineAsJson.name && lineAsJson.version && lineAsJson.mode) {
67
68
  // First line is metadata.
68
- mappingDebug(`Parsed backup file metadata ${lineAsJson.name} ${lineAsJson.version} ${lineAsJson.mode}.`);
69
+ mappingDebug(`Parsed backup file metadata ${lineAsJson.name} ${lineAsJson.version} ${lineAsJson.mode} ${lineAsJson.attachments}.`);
69
70
  // This identifies a version of 2.10.0 or newer that wrote the backup file.
70
71
  // Set the mode that was used for the backup file.
71
72
  this.backupMode = lineAsJson.mode;
@@ -73,6 +74,16 @@ class Restore {
73
74
  // were associated with a resume, so unset the ignore flag.
74
75
  this.suppressAllBrokenJSONErrors = false;
75
76
  // Later we may add other version/feature specific toggles here.
77
+ if (lineAsJson.attachments === true) {
78
+ if (!this.options.attachments) {
79
+ // Error out if trying to restore attachments without the option
80
+ throw new BackupError('AttachmentsNotEnabledError', 'To restore a backup file with attachments, enable the attachments option.');
81
+ }
82
+ } else {
83
+ if (this.options.attachments) {
84
+ throw new BackupError('AttachmentsMetadataAbsent', 'Cannot restore with attachments because the backup file was not created with the attachments option.');
85
+ }
86
+ }
76
87
  } else if (lineAsJson.marker && lineAsJson.marker === marker) {
77
88
  mappingDebug(`Resume marker on line ${backupLine.lineNumber} of backup file.`);
78
89
  } else {
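The header check added in the hunk above pairs with the new exit codes `60` and `61`. The sketch below restates that decision as a standalone function. `checkAttachments` is a hypothetical name, and it returns the numeric code instead of throwing a `BackupError` as the package does. The header values are illustrative.

```javascript
'use strict';

// Exit codes as listed in the error.js hunk of this diff
const codes = { AttachmentsNotEnabledError: 60, AttachmentsMetadataAbsent: 61 };

// Hypothetical standalone version of the check: compares the backup file
// header with the restore options and returns 0 when they agree.
function checkAttachments(headerLine, options) {
  const metadata = JSON.parse(headerLine);
  const wanted = Boolean(options.attachments);
  if (metadata.attachments === true && !wanted) {
    return codes.AttachmentsNotEnabledError; // backup has attachments, restore did not ask for them
  }
  if (metadata.attachments !== true && wanted) {
    return codes.AttachmentsMetadataAbsent; // restore asked for attachments, backup has none recorded
  }
  return 0;
}

// Illustrative header lines in the shape written by the backup
const withAttachments = JSON.stringify({ name: '@cloudant/couchbackup', version: '2.10.3', mode: 'full', attachments: true });
const withoutAttachments = JSON.stringify({ name: '@cloudant/couchbackup', version: '2.10.3', mode: 'full' });

console.log(checkAttachments(withAttachments, { attachments: false }));   // 60
console.log(checkAttachments(withoutAttachments, { attachments: true })); // 61
console.log(checkAttachments(withAttachments, { attachments: true }));    // 0
```

Because the comparison is against `true` specifically, a header written by an older version with no `attachments` key is treated the same as one written with `attachments: false`.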
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@cloudant/couchbackup",
3
- "version": "2.10.3-SNAPSHOT-239",
3
+ "version": "2.10.3-SNAPSHOT-241",
4
4
  "description": "CouchBackup - command-line backup utility for Cloudant/CouchDB",
5
5
  "homepage": "https://github.com/IBM/couchbackup",
6
6
  "repository": {
@@ -48,7 +48,7 @@
48
48
  "eslint-plugin-promise": "6.6.0",
49
49
  "http-proxy": "1.18.1",
50
50
  "mocha": "10.7.3",
51
- "nock": "13.5.4",
51
+ "nock": "13.5.5",
52
52
  "tail": "2.2.6",
53
53
  "uuid": "10.0.0"
54
54
  },