dar-backup 1.0.1__py3-none-any.whl → 1.0.2__py3-none-any.whl

dar_backup/Changelog.md CHANGED
@@ -1,6 +1,35 @@
1
1
  <!-- markdownlint-disable MD024 -->
2
2
  # dar-backup Changelog
3
3
 
4
+ ## v2-1.0.2 - 2026-01-25
5
+
6
+ ### Added
7
+
8
+ - Streaming restore-test sampling using reservoir sampling to avoid holding full file lists in memory.
9
+ - Configurable command output capture cap (`COMMAND_CAPTURE_MAX_BYTES`, default 100 KB) to limit in-memory stdout/stderr while still logging full output.
10
+ - Streaming list output for `dar-backup --list-contents` and `manager --list-archive-contents` to avoid large in-memory buffers.
11
+ - Test coverage additions for config parsing, util helpers, restore-test sampling edge cases, par2 slice helpers, and get_backed_up_files error paths.
12
+ - Tests for missing source files during restore verification and for par2 generation order after verification.
13
+ - CommandRunner test coverage for sanitize failure notes, text/binary output handling, timeouts, Popen failures, and TTY restore logic.
14
+ - Tests for COMMAND_CAPTURE_MAX_BYTES defaults (0 and 1k) and binary stdout/stderr capture with truncation and log_output disabled.
15
+ - Manager test coverage for create-db guardrails and catalog listing parsing/sorting across runner/subprocess paths.
16
+ - Cleanup now reports PREREQ/POSTREQ failures cleanly and sends Discord failure notifications when configured.
17
+ - New trace logger that always logs at DEBUG and captures stack traces when they occur. Default max size is 10 MB, with 1 rollover file.
18
+
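The streaming restore-test sampling mentioned above can be illustrated with a minimal reservoir-sampling sketch (Algorithm R). This is not the package's implementation; the function name `reservoir_sample` and its parameters are hypothetical and chosen only for illustration.

```python
import random
from typing import Iterable, List, Optional, TypeVar

T = TypeVar("T")


def reservoir_sample(items: Iterable[T], k: int,
                     rng: Optional[random.Random] = None) -> List[T]:
    """Pick up to k items uniformly at random from a stream of unknown length.

    Only k items are held in memory at any time (hypothetical helper).
    """
    if k < 0:
        raise ValueError("k must be non-negative")
    rng = rng or random.Random()
    sample: List[T] = []
    for index, item in enumerate(items):
        if index < k:
            # Fill the reservoir with the first k items.
            sample.append(item)
        else:
            # Keep the new item with probability k / (index + 1).
            slot = rng.randint(0, index)
            if slot < k:
                sample[slot] = item
    return sample


# Usage: sample 5 candidates from a generator without building a full list.
picked = reservoir_sample((f"file_{i}" for i in range(1000)), 5)
```

Because the input is consumed one item at a time, the full file list never has to be materialized, which is the property the changelog entry describes.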
19
+ ### Changed
20
+
21
+ - BUGFIX: Ensure existing files are removed before restore verification to prevent false positives.
22
+ - Clears out restore-test directory on program start to ensure a clean slate.
23
+ - Restore-test selection now streams DAR XML listings and samples candidates without loading all entries into RAM.
24
+ - `get_backed_up_files` uses incremental XML parsing to reduce memory use for large archives.
25
+ - Restore verification now logs a warning and continues when a source or restored file is missing during comparison.
26
+ - CommandRunner supports per-command capture cap overrides (disable cap with `capture_output_limit_bytes=-1`).
27
+ - Cleanup now rejects unsafe archive names when `--cleanup-specific-archives` is used to prevent accidental deletions.
28
+ - Removed deprecated PAR2 layout/mode settings and simplified PAR2 cleanup to delete all matching .par2 artifacts.
29
+ - Config templates/docs updated to drop PAR2_LAYOUT/PAR2_MODE references.
30
+ - [Snyk] Python 3.11 required in pyproject.toml. Snyk has flagged a vulnerability in an XML parser that requires the bump to 3.11.
31
+ - Config parsing errors now emit concise messages (no stack trace) and trigger Discord failure notifications in CLI tools.
32
+
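The incremental XML parsing mentioned above can be sketched with the standard library's `xml.etree.ElementTree.iterparse`. This is an assumption-laden illustration, not the code of `get_backed_up_files`: the helper name `iter_file_paths` is hypothetical, and the sketch assumes that `<Directory>` and `<File>` elements carry a `name` attribute.

```python
import io
import xml.etree.ElementTree as ET
from typing import BinaryIO, Iterator, List


def iter_file_paths(xml_stream: BinaryIO) -> Iterator[str]:
    """Yield file paths from a nested Directory/File XML listing, one at a time.

    Assumes (hypothetically) that each element stores its name in a
    `name` attribute.
    """
    dir_stack: List[str] = []
    for event, elem in ET.iterparse(xml_stream, events=("start", "end")):
        if event == "start" and elem.tag == "Directory":
            # Attributes are already available on "start" events.
            dir_stack.append(elem.get("name", ""))
        elif event == "end" and elem.tag == "File":
            yield "/".join(dir_stack + [elem.get("name", "")])
            elem.clear()  # release the element's data once it has been used
        elif event == "end" and elem.tag == "Directory":
            dir_stack.pop()
            elem.clear()  # drop the already-processed children


# Usage with a small in-memory listing.
listing = (b'<Catalog><Directory name="home"><Directory name="user">'
           b'<File name="a.txt"/></Directory></Directory></Catalog>')
paths = list(iter_file_paths(io.BytesIO(listing)))
```

Clearing elements as they are finished keeps memory use roughly proportional to the size of the directory being processed rather than to the whole archive listing.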
4
33
  ## v2-1.0.1 - 2026-01-09
5
34
 
6
35
  ### Added
dar_backup/README.md CHANGED
@@ -95,8 +95,8 @@ Version **1.0.0** was reached on October 9, 2025.
95
95
  - [Performance tip due to par2](#performance-tip-due-to-par2)
96
96
  - [.darrc sets -vd -vf (since v0.6.4)](#darrc-sets--vd--vf-since-v064)
97
97
  - [Separate log file for command output](#separate-log-file-for-command-output)
98
+ - [Trace Logging (Debug details)](#trace-logging-debug-details)
98
99
  - [Skipping cache directories](#skipping-cache-directories)
99
- - [Progress bar and current directory](#progress-bar-and-current-directory)
100
100
  - [Shell autocompletion](#shell-autocompletion)
101
101
  - [Use it](#use-it)
102
102
  - [Archive name completion (smart, context-aware)](#archive-name-completion-smart-context-aware)
@@ -123,6 +123,9 @@ Version **1.0.0** was reached on October 9, 2025.
123
123
  - [DISCORD WEBHOOK](#discord-webhook)
124
124
  - [Restore test config](#restore-test-config)
125
125
  - [Par2](#par2-1)
126
+ - [1.0.2](#102)
127
+ - [Trace Logging](#trace-logging)
128
+ - [Command output Capture](#command-output-capture)
126
129
 
127
130
  ## My use case
128
131
 
@@ -791,7 +794,6 @@ ERROR_CORRECTION_PERCENT = 5
791
794
  ENABLED = True
792
795
  # Optional PAR2 configuration
793
796
  # PAR2_DIR = /path/to/par2-store
794
- # PAR2_MODE = per-slice
795
797
  # PAR2_RATIO_FULL = 10
796
798
  # PAR2_RATIO_DIFF = 5
797
799
  # PAR2_RATIO_INCR = 5
@@ -800,7 +802,6 @@ ENABLED = True
800
802
  # Optional per-backup overrides (section name = backup definition)
801
803
  [media-files]
802
804
  PAR2_DIR = /mnt/par2/media-files
803
- PAR2_MODE = per-archive
804
805
  PAR2_RATIO_FULL = 10
805
806
 
806
807
  # scripts to run before the backup to setup the environment
@@ -818,6 +819,7 @@ PAR2 notes:
818
819
  - If `PAR2_DIR` is unset, par2 files are created next to the archive slices (legacy behavior) and no manifest is written
819
820
  - When `PAR2_DIR` is set, dar-backup writes a manifest next to the par2 set:
820
821
  `archive_base.par2.manifest.ini`
822
+ - When generating a par2 set, par2 reads all archive slices before writing any output files; for large backups, this initial read can take hours
821
823
  - Verify or repair using:
822
824
  `par2 verify -B <archive_dir> <par2_set.par2>`
823
825
  `par2 repair -B <archive_dir> <par2_set.par2>`
@@ -1143,6 +1145,25 @@ This happens when the shell splits the quoted string or interprets globs before
1143
1145
 
1144
1146
  > 💡 **Tip:** See [dar's documentation](http://dar.linux.free.fr/doc/man/dar.html#COMMANDS%20AND%20OPTIONS)
1145
1147
 
1148
+ >
1149
+ > 💡💡 **Tip:** To filter out the empty directories that `dar` emits when listing contents, append this grep:
1150
+ >
1151
+ > ```bash
1152
+ > |grep -vE '\s+d[rwx-]{9}\s'
1153
+ >```
1154
+ >
1155
+ > Example using the grep to discard directory noise from `dar`'s output:
1156
+ >
1157
+ > ```bash
1158
+ > dar-backup --list-contents media-files_INCR_2025-05-10 --selection="-I '*Z50*' -X '*.xmp'" | grep -vE '\s+d[rwx-]{9}\s'
1159
+ >[Saved][ ] [-L-][ 0%][X] -rw-rw-r-- user user 26 Mio Fri May 9 11:26:16 2025 home/user/data/2025/2025-05-09-Roskilde-Nordisk-udstilling/Z50_0633.NEF
1160
+ >[Saved][ ] [-L-][ 0%][X] -rw-rw-r-- user user 26 Mio Fri May 9 11:26:16 2025 home/user/data/2025/2025-05-09-Roskilde-Nordisk-udstilling/Z50_0632.NEF
1161
+ >[Saved][ ] [-L-][ 0%][X] -rw-rw-r-- user user 28 Mio Fri May 9 11:09:04 2025 home/user/data/2025/2025-05-09-Roskilde-Nordisk-udstilling/Z50_0631.NEF
1162
+ >[Saved][ ] [-L-][ 0%][X] -rw-rw-r-- user user 29 Mio Fri May 9 11:09:03 2025 home/user/data/2025/2025-05-09-Roskilde-Nordisk-udstilling/Z50_0630.NEF
1163
+ >[Saved][ ] [-L-][ 0%][X] -rw-rw-r-- user user 29 Mio Fri May 9 11:09:03 2025 home/user/data/2025/2025-05-09-Roskilde-Nordisk-udstilling/Z50_0629.NEF
1164
+ >...
1165
+ >```
1166
+
1146
1167
  ### select a directory
1147
1168
 
1148
1169
  Select files and sub directories in `home/user/data/2025/2025-05-09-Roskilde-Nordisk-udstilling`
@@ -1499,23 +1520,29 @@ In order to not clutter that log file with the output of commands being run, a n
1499
1520
 
1500
1521
  The secondary log file can get quite cluttered. To remove the clutter, run the `clean-log` script with the `--file` option, or simply delete the file.
1501
1522
 
1502
- ### Skipping cache directories
1523
+ ### Trace Logging (Debug details)
1503
1524
 
1504
- The author uses the `--cache-directory-tagging` option in his [backup definitions](#backup-definition-example).
1525
+ To keep the main log file clean while preserving essential debugging information, `dar-backup` creates a separate trace log file (e.g., `dar-backup.trace.log`) alongside the main log.
1505
1526
 
1506
- The effect is that directories with the [CACHEDIR.TAG](https://bford.info/cachedir/) file are not backed up. Those directories contain content fetched from the net, which is of an ephemeral nature and probably not what you want to back up.
1527
+ - **Main Log (`dar-backup.log`)**: Contains clean, human-readable INFO/ERROR messages. Stack traces are suppressed here.
1528
+ - **Trace Log (`dar-backup.trace.log`)**: Captures ALL messages at `DEBUG` level, including full exception stack traces. Use this file for debugging crashes or unexpected behavior.
1507
1529
 
1508
- If the option is not in the backup definition, the cache directories are backed up as any other.
1530
+ You can configure the rotation of this file in `[MISC]`:
1509
1531
 
1510
- ### Progress bar and current directory
1532
+ ```ini
1533
+ [MISC]
1534
+ # ... other settings ...
1535
+ TRACE_LOG_MAX_BYTES = 10485760 # 10 MB default
1536
+ TRACE_LOG_BACKUP_COUNT = 1 # Keep 1 old trace file (default)
1537
+ ```
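The split between a clean main log and a rotating trace log can be sketched with Python's standard `logging` module. This is a minimal illustration under stated assumptions, not the package's logger setup: `build_loggers` and `NoTracebackFormatter` are hypothetical names, and the defaults simply mirror the `TRACE_LOG_MAX_BYTES` and `TRACE_LOG_BACKUP_COUNT` values shown above.

```python
import logging
import os
from logging.handlers import RotatingFileHandler


class NoTracebackFormatter(logging.Formatter):
    """Formatter that omits exception and stack details (hypothetical helper)."""

    def format(self, record):
        saved = (record.exc_info, record.exc_text, record.stack_info)
        record.exc_info = record.exc_text = record.stack_info = None
        try:
            return super().format(record)
        finally:
            # Restore the record so other handlers still see the traceback.
            record.exc_info, record.exc_text, record.stack_info = saved


def build_loggers(log_dir, name="trace_demo",
                  max_bytes=10485760, backup_count=1):
    """Attach an INFO main log and a DEBUG rotating trace log to one logger."""
    fmt = "%(asctime)s - %(levelname)s - %(message)s"
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)
    logger.propagate = False

    main_handler = logging.FileHandler(os.path.join(log_dir, "dar-backup.log"))
    main_handler.setLevel(logging.INFO)
    main_handler.setFormatter(NoTracebackFormatter(fmt))

    trace_handler = RotatingFileHandler(
        os.path.join(log_dir, "dar-backup.trace.log"),
        maxBytes=max_bytes, backupCount=backup_count)
    trace_handler.setLevel(logging.DEBUG)
    trace_handler.setFormatter(logging.Formatter(fmt))

    logger.addHandler(main_handler)
    logger.addHandler(trace_handler)
    return logger
```

With this arrangement, a call such as `logger.error("msg", exc_info=True)` writes a one-line entry to the main log and the full traceback to the trace log.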
1511
1538
 
1512
- If you run dar-backup interactively in a "normal" console on your computer,
1513
- dar-backup displays 2 visual artifacts to show progress.
1539
+ ### Skipping cache directories
1514
1540
 
1515
- 1. a progress bar that fills up and starts over
1516
- 2. a status line showing the directory being backed up. If the directory is big and takes time to backup, the line is not changing, but you will probably know there is a lot to backup.
1541
+ The author uses the `--cache-directory-tagging` option in his [backup definitions](#backup-definition-example).
1517
1542
 
1518
- The indicators are not shown if dar-backup is run from systemd or if it is used in terminal multiplexers like `tmux` or `screen`. So no polluting of journald logs.
1543
+ The effect is that directories with the [CACHEDIR.TAG](https://bford.info/cachedir/) file are not backed up. Those directories contain content fetched from the net, which is of an ephemeral nature and probably not what you want to back up.
1544
+
1545
+ If the option is not in the backup definition, the cache directories are backed up as any other.
1519
1546
 
1520
1547
  ### Shell autocompletion
1521
1548
 
@@ -1700,7 +1727,6 @@ pytest # run the test suite
1700
1727
 
1701
1728
  - Perhaps look into pre-processing backup definitions. As `dar` does not expand env vars
1702
1729
  `dar-backup` could do so and feed the result to `dar`.
1703
- - When run interactively, a progress bar during test and par2 generation would be nice.
1704
1730
  - Look into a way to move the .par2 files away from the `dar` slices, to maximize chance of good redundancy.
1705
1731
  - Add option to dar-backup to use the `dar` option `--fsa-scope none`
1706
1732
 
@@ -2036,10 +2062,44 @@ PAR2_RUN_VERIFY = true
2036
2062
  #
2037
2063
  #[etc]
2038
2064
  # Keep global PAR2 settings but tweak ratios for this backup definition
2039
- # RATIO is i percent number
2065
+ # RATIO is given in percent (%)
2040
2066
  #PAR2_RATIO_FULL = 15
2041
2067
  #PAR2_RATIO_DIFF = 8
2042
2068
  #PAR2_RATIO_INCR = 8
2043
2069
  ```
2044
2070
 
2045
2071
  [Per-backup override test case: `tests/test_par2_overrides.py`](v2/tests/test_par2_overrides.py)
2072
+
2073
+ #### 1.0.2
2074
+
2075
+ ##### Trace Logging
2076
+
2077
+ To support debugging without cluttering the main log file, a secondary trace log is now created (e.g., `dar-backup.trace.log`).
2078
+ This file captures all `DEBUG` level messages and full exception stack traces.
2079
+
2080
+ You can configure its rotation in the `[MISC]` section:
2081
+
2082
+ - `TRACE_LOG_MAX_BYTES`: Max size of the trace log file in bytes. Default is `10485760` (10 MB).
2083
+ - `TRACE_LOG_BACKUP_COUNT`: Number of rotated trace log files to keep. Default is `1`.
2084
+
2085
+ Example:
2086
+
2087
+ ```ini
2088
+ [MISC]
2089
+ TRACE_LOG_MAX_BYTES = 10485760
2090
+ TRACE_LOG_BACKUP_COUNT = 1
2091
+ ```
2092
+
2093
+ ##### Command output Capture
2094
+
2095
+ - New optional `[MISC]` setting: `COMMAND_CAPTURE_MAX_BYTES` (default 102400).
2096
+ - Limits how much stdout/stderr is kept in memory per command while still logging full output.
2097
+ - Set to `0` to disable buffering entirely. Command output is still streamed to `dar-backup-commands.log`.
2098
+ - If set to `0`, the calling function cannot rely on output from the executed command. The exit value is the only result provided.
2099
+
2100
+ Example:
2101
+
2102
+ ```ini
2103
+ [MISC]
2104
+ COMMAND_CAPTURE_MAX_BYTES = 102400
2105
+ ```
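The idea behind a capture cap can be sketched as follows: every chunk of command output is written to a log file, while only the first `max_bytes` bytes are kept in memory. This is a hypothetical illustration, not the package's `CommandRunner`; the function name `run_capped`, the merging of stderr into stdout, and the returned tuple are all assumptions. The handling of `0` (keep nothing) and `-1` (no cap) follows the settings described in this diff.

```python
import subprocess
from typing import List, Tuple


def run_capped(cmd: List[str], log_path: str,
               max_bytes: int = 102400) -> Tuple[int, bytes, bool]:
    """Run cmd, log all output, and keep at most max_bytes of it in memory.

    Returns (exit_code, captured_bytes, truncated). Hypothetical helper:
    stderr is merged into stdout to keep the sketch short.
    """
    captured = bytearray()
    truncated = False
    with open(log_path, "ab") as log, subprocess.Popen(
            cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT) as proc:
        for chunk in iter(lambda: proc.stdout.read(4096), b""):
            log.write(chunk)  # the log always receives the full output
            if max_bytes < 0:
                captured += chunk  # a negative limit means "no cap"
            else:
                room = max_bytes - len(captured)
                captured += chunk[:room]
                truncated = truncated or len(chunk) > room
    # Popen's context manager waits for the process, so returncode is set.
    return proc.returncode, bytes(captured), truncated
```

With `max_bytes=0` the caller gets an empty capture and can rely only on the exit code, which matches the note above about the `0` setting.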
dar_backup/__about__.py CHANGED
@@ -1,8 +1,7 @@
1
- __version__ = "1.0.1"
1
+ __version__ = "1.0.2"
2
2
 
3
3
  __author__ = "Per Jensen"
4
4
 
5
5
  __license__ = '''Licensed under GNU GENERAL PUBLIC LICENSE v3, see the supplied file "LICENSE" for details.
6
6
  THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW, not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
7
7
  See section 15 and section 16 in the supplied "LICENSE" file.'''
8
-
dar_backup/clean_log.py CHANGED
@@ -23,13 +23,54 @@ import re
23
23
  import os
24
24
  import sys
25
25
 
26
+ from datetime import datetime
27
+
26
28
  from dar_backup import __about__ as about
27
29
  from dar_backup.config_settings import ConfigSettings
30
+ from dar_backup.util import send_discord_message, get_logger
28
31
 
29
32
  LICENSE = '''Licensed under GNU GENERAL PUBLIC LICENSE v3, see the supplied file "LICENSE" for details.
30
33
  THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW, not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
31
34
  See section 15 and section 16 in the supplied "LICENSE" file.'''
32
35
 
36
+ TIMESTAMP_RE = re.compile(r"^\d{4}-\d{2}-\d{2}\b")
37
+ CLEAN_MESSAGE_PREFIXES = (
38
+     "Inspecting directory",
39
+     "Finished Inspecting",
40
+     "<File",
41
+     "</File",
42
+     "<Attributes",
43
+     "</Attributes",
44
+     "<Directory",
45
+     "</Directory",
46
+     "<Catalog",
47
+     "</Catalog",
48
+     "<Symlink",
49
+     "</Symlink",
50
+ )
51
+
52
+ def _split_level_and_message(line):
53
+     line = line.rstrip("\n")
54
+     if " - " not in line:
55
+         return None, None
56
+
57
+     parts = line.split(" - ")
58
+     if len(parts) >= 3 and TIMESTAMP_RE.match(parts[0].strip()):
59
+         level = parts[1]
60
+         message = " - ".join(parts[2:])
61
+     else:
62
+         level = parts[0]
63
+         message = " - ".join(parts[1:])
64
+
65
+     return level.strip(), message
66
+
67
+ def _should_remove_line(line):
68
+     level, message = _split_level_and_message(line)
69
+     if level != "INFO" or message is None:
70
+         return False
71
+     message = message.lstrip()
72
+     return any(message.startswith(prefix) for prefix in CLEAN_MESSAGE_PREFIXES)
73
+
33
74
  def clean_log_file(log_file_path, dry_run=False):
34
75
  """Removes specific log lines from the given file using a memory-efficient streaming approach."""
35
76
 
@@ -42,7 +83,7 @@ def clean_log_file(log_file_path, dry_run=False):
42
83
  print(f"No read permission for '{log_file_path}'")
43
84
  sys.exit(1)
44
85
 
45
- if not os.access(log_file_path, os.W_OK):
86
+ if not dry_run and not os.access(log_file_path, os.W_OK):
46
87
  print(f"Error: No write permission for '{log_file_path}'")
47
88
  sys.exit(1)
48
89
 
@@ -51,47 +92,20 @@ def clean_log_file(log_file_path, dry_run=False):
51
92
  print(f"Performing a dry run on: {log_file_path}")
52
93
 
53
94
  temp_file_path = log_file_path + ".tmp"
54
-
55
- patterns = [
56
- r"INFO\s*-\s*Inspecting\s*directory",
57
- r"INFO\s*-\s*Finished\s*Inspecting",
58
- r"INFO\s*-\s*<File",
59
- r"INFO\s*-\s*</File",
60
- r"INFO\s*-\s*<Attributes",
61
- r"INFO\s*-\s*</Attributes",
62
- r"INFO\s*-\s*</Directory",
63
- r"INFO\s*-\s*<Directory",
64
- r"INFO\s*-\s*<Catalog",
65
- r"INFO\s*-\s*</Catalog",
66
- r"INFO\s*-\s*<Symlink",
67
- r"INFO\s*-\s*</Symlink",
68
- ]
69
95
 
70
96
  try:
71
- with open(log_file_path, "r", errors="ignore") as infile, open(temp_file_path, "w") as outfile:
97
+ if dry_run:
98
+ with open(log_file_path, "r", errors="ignore") as infile:
99
+ for line in infile:
100
+ if _should_remove_line(line):
101
+ print(f"Would remove: {line.strip()}")
102
+ return
72
103
 
104
+ with open(log_file_path, "r", errors="ignore") as infile, open(temp_file_path, "w") as outfile:
73
105
  for line in infile:
74
- original_line = line # Store the original line before modifying it
75
- matched = False # Track if a pattern is matched
76
-
77
- for pattern in patterns:
78
- if re.search(pattern, line): # Check if the pattern matches
79
- if dry_run:
80
- print(f"Would remove: {original_line.strip()}") # Print full line for dry-run
81
- matched = True # Mark that a pattern matched
82
- break # No need to check other patterns if one matches
83
-
84
- if not dry_run and not matched: # In normal mode, only write non-empty lines
106
+ if not _should_remove_line(line):
85
107
  outfile.write(line.rstrip() + "\n")
86
108
 
87
- if dry_run and matched:
88
- continue # In dry-run mode, skip writing (since we’re just showing)
89
-
90
-
91
- # Ensure the temp file exists before renaming
92
- if not os.path.exists(temp_file_path):
93
- open(temp_file_path, "w").close() # Create an empty file if nothing was written
94
-
95
109
  os.replace(temp_file_path, log_file_path)
96
110
  print(f"Successfully cleaned log file: {log_file_path}")
97
111
 
@@ -131,43 +145,68 @@ def main():
131
145
 
132
146
  args = parser.parse_args()
133
147
 
134
- config_settings = ConfigSettings(os.path.expanduser(os.path.expandvars(args.config_file)))
148
+ try:
149
+ config_settings = ConfigSettings(os.path.expanduser(os.path.expandvars(args.config_file)))
150
+ except Exception as exc:
151
+ msg = f"Config error: {exc}"
152
+ print(msg, file=sys.stderr)
153
+ ts = datetime.now().strftime("%Y-%m-%d_%H:%M")
154
+ send_discord_message(f"{ts} - clean-log: FAILURE - {msg}")
155
+ sys.exit(127)
135
156
 
136
- if not args.file:
137
- args.file = [config_settings.logfile_location]
157
+ try:
158
+ files_to_clean = args.file if args.file else [config_settings.logfile_location]
159
+ logfile_dir = os.path.dirname(os.path.realpath(config_settings.logfile_location))
160
+ validated_files = []
138
161
 
139
- for file_path in args.file:
162
+ for file_path in files_to_clean:
163
+ if not isinstance(file_path, (str, bytes, os.PathLike)):
164
+ print(f"Error: Invalid file path type: {file_path}")
165
+ sys.exit(1)
140
166
 
141
- if ".." in os.path.normpath(file_path).split(os.sep):
142
- print(f"Error: Path traversal is not allowed: '{file_path}'")
143
- sys.exit(1)
167
+ file_path = os.fspath(file_path)
168
+ if isinstance(file_path, bytes):
169
+ file_path = os.fsdecode(file_path)
144
170
 
145
- logfile_dir = os.path.dirname(os.path.realpath(config_settings.logfile_location))
146
- resolved_path = os.path.realpath(file_path)
171
+ if file_path.strip() == "":
172
+ print(f"Error: Invalid empty filename '{file_path}'.")
173
+ sys.exit(1)
147
174
 
148
- if not resolved_path.startswith(logfile_dir + os.sep):
149
- print(f"Error: File is outside allowed directory: '{file_path}'")
150
- sys.exit(1)
175
+ if ".." in os.path.normpath(file_path).split(os.sep):
176
+ print(f"Error: Path traversal is not allowed: '{file_path}'")
177
+ sys.exit(1)
151
178
 
152
- # Validate the file path type and existence
153
- if not isinstance(file_path, (str, bytes, os.PathLike)):
154
- print(f"Error: Invalid file path type: {file_path}")
155
- sys.exit(1)
179
+ resolved_path = os.path.realpath(file_path)
156
180
 
157
- if not os.path.exists(file_path):
158
- print(f"Error: Log file '{file_path}' does not exist.")
159
- sys.exit(1)
181
+ if not resolved_path.startswith(logfile_dir + os.sep):
182
+ print(f"Error: File is outside allowed directory: '{file_path}'")
183
+ sys.exit(1)
160
184
 
161
- if file_path.strip() == "":
162
- print(f"Error: Invalid empty filename '{file_path}'.")
163
- sys.exit(1)
185
+ if not os.path.exists(file_path):
186
+ print(f"Error: Log file '{file_path}' does not exist.")
187
+ sys.exit(1)
164
188
 
165
-
166
- # Run the log file cleaning function
167
- for log_file in args.file:
168
- clean_log_file(log_file, dry_run=args.dry_run)
169
- print(f"Log file '{args.file}' has been cleaned successfully.")
189
+ validated_files.append(file_path)
170
190
 
171
191
 
192
+ # Run the log file cleaning function
193
+ for log_file in validated_files:
194
+ clean_log_file(log_file, dry_run=args.dry_run)
195
+ file_list = ", ".join(validated_files)
196
+ if args.dry_run:
197
+ print(f"Dry run complete for: {file_list}")
198
+ else:
199
+ print(f"Log file '{file_list}' has been cleaned successfully.")
200
+ except Exception as e:
201
+ msg = f"Unexpected error during clean-log: {e}"
202
+ logger = get_logger()
203
+ if logger:
204
+ logger.error(msg, exc_info=True)
205
+ else:
206
+ print(msg, file=sys.stderr)
207
+
208
+ ts = datetime.now().strftime("%Y-%m-%d_%H:%M")
209
+ send_discord_message(f"{ts} - clean-log: FAILURE - {msg}", config_settings=config_settings)
210
+ sys.exit(1)
172
211
  if __name__ == "__main__":
173
212
  main()