lsb-tool 2.0.0__py3-none-any.whl

@@ -0,0 +1,177 @@
1
+ Metadata-Version: 2.4
2
+ Name: lsb-tool
3
+ Version: 2.0.0
4
+ Summary: Embed and extract files hidden inside PNG images using N-bit LSB steganography.
5
+ License-Expression: MIT
6
+ Keywords: steganography,lsb,png,embed,security
7
+ Classifier: Programming Language :: Python :: 3
8
+ Classifier: Programming Language :: Python :: 3.8
9
+ Classifier: Programming Language :: Python :: 3.9
10
+ Classifier: Programming Language :: Python :: 3.10
11
+ Classifier: Programming Language :: Python :: 3.11
12
+ Classifier: Programming Language :: Python :: 3.12
13
+ Classifier: Operating System :: OS Independent
14
+ Classifier: Topic :: Security :: Cryptography
15
+ Classifier: Topic :: Multimedia :: Graphics
16
+ Requires-Python: >=3.8
17
+ Description-Content-Type: text/markdown
18
+ Requires-Dist: Pillow>=8.0
19
+ Requires-Dist: numpy>=1.20
20
+ Provides-Extra: dev
21
+ Requires-Dist: pytest>=7.0; extra == "dev"
22
+
23
+ # LSB Tool
24
+
25
+ Embed files inside a PNG image or extract previously hidden files using
26
+ **N-bit LSB steganography** secured by a password.
27
+
28
+ Files are scattered across pixels in a password-derived pseudo-random order, so
29
+ tools that simply read LSBs in sequential pixel order will not recover the
30
+ hidden data.
31
+
32
+ ---
33
+
34
+ ## Features
35
+
36
+ | | |
37
+ |-|-|
38
+ | **All image modes** | 1-bit, 8-bit grayscale/palette, 16-bit grayscale, RGB, RGBA |
39
+ | **N-bit embedding** | Store 1–N bits per channel (N limited by channel bit-depth) |
40
+ | **Multiple files** | Embed any number of files in a single carrier image |
41
+ | **Filename storage** | Optionally preserve original filenames on extraction |
42
+ | **Error codes** | Every failure exits with a distinct code — see [ERRORS.md](ERRORS.md) |
43
+
44
+ ---
45
+
46
+ ## Requirements
47
+
48
+ - Python 3.8+
49
+ - Pillow ≥ 8.0
50
+ - NumPy ≥ 1.20
51
+
52
+ ```bash
53
+ pip install -r requirements.txt
54
+ ```
55
+
56
+ ---
57
+
58
+ ## Usage
59
+
60
+ ### Embed
61
+
62
+ ```bash
63
+ python main.py -E -i carrier.png -p password -f file1.txt file2.bin
64
+ ```
65
+
66
+ Options:
67
+
68
+ | Flag | Description |
69
+ |------|-------------|
70
+ | `-i` | Carrier image (any PIL-supported format; output is always PNG) |
71
+ | `-p` | Password used to derive the pixel-shuffle seed and verification hash |
72
+ | `-f` | One or more files to embed |
73
+ | `-l` | Bits per channel to use — **embedding depth** (default `1`, max depends on image type) |
74
+ | `-n` | Filename field length in bytes 0–255 (default `0` = filenames not stored) |
75
+ | `-v` | Print capacity and hash diagnostics after the operation |
76
+
77
+ Output: `<original_name>_embedded.png` in the current directory.
78
+
79
+ ### Extract
80
+
81
+ ```bash
82
+ python main.py -e -i carrier_embedded.png -p password
83
+ ```
84
+
85
+ Extracted files are written to the current directory. If `-n` was used during
86
+ embedding, the original filenames are restored; otherwise files are named
87
+ `extracted_file_0`, `extracted_file_1`, …
88
+
89
+ ---
90
+
91
+ ## Embedding depth (`-l`)
92
+
93
+ Higher depth stores more bits per channel, multiplying capacity roughly
94
+ proportionally, but making changes more visible to the human eye.
95
+
96
+ | Image type | Max depth | Capacity formula (bytes) |
97
+ |------------|-----------|--------------------------|
98
+ | 1-bit (`1`) | 1 | `(pixels − 13) × 1 / 8` |
99
+ | 8-bit (`L`, `LA`, `RGB`, `RGBA`, `P`) | 8 | `(pixels − preamble) × channels × depth / 8` |
100
+ | 16-bit grayscale (`I;16`) | 16 | same formula |
101
+
102
+ If you request a depth greater than the image supports, the tool warns and
103
+ clamps to the maximum automatically.
104
+
105
+ ---
106
+
107
+ ## Examples
108
+
109
+ ```bash
110
+ # Embed two files at depth 1 (default)
111
+ python main.py -E -i photo.png -p hunter2 -f report.pdf archive.zip
112
+
113
+ # Extract (restores original files)
114
+ python main.py -e -i photo_embedded.png -p hunter2
115
+
116
+ # Embed with filename storage and depth 4
117
+ python main.py -E -i photo.png -p hunter2 -f secret.txt -l 4 -n 32
118
+
119
+ # Extract — file is written as "secret.txt" instead of "extracted_file_0"
120
+ python main.py -e -i photo_embedded.png -p hunter2
121
+
122
+ # Check capacity (verbose)
123
+ python main.py -E -i photo.png -p hunter2 -f data.bin -v
124
+ ```
125
+
126
+ ---
127
+
128
+ ## Image format note
129
+
130
+ Keep the embedded output as PNG. Re-saving it in a lossy format (JPEG, lossy
131
+ WebP) alters pixel values and destroys the hidden data. The tool always saves
132
+ output as PNG regardless of the input format.
133
+
134
+ ---
135
+
136
+ ## Running tests
137
+
138
+ ```bash
139
+ pip install pytest
140
+ pytest tests/ -v
141
+ ```
142
+
143
+ ---
144
+
145
+ ## Error codes
146
+
147
+ All errors exit with a specific code and print `[Exx] …` to stderr.
148
+ See [ERRORS.md](ERRORS.md) for the full reference.
149
+
150
+ | Code | Meaning |
151
+ |------|---------|
152
+ | E02 | No mode selected (`-E` or `-e` required) |
153
+ | E03 | Carrier image not found |
154
+ | E04 | A file to embed was not found |
155
+ | E05 | Files too large for this image at the chosen depth |
156
+ | E06 | Wrong password or no embedded data |
157
+ | E07 | Image file is corrupt or unsupported |
158
+ | E08 | Cannot write an output file |
159
+ | E10 | Internal error (bug) |
160
+
161
+ ---
162
+
163
+ ## How it works
164
+
165
+ 1. **Key derivation** — The password and image dimensions are hashed (SHA-256,
166
+ iterated `width × height` times) to produce a 64-character hex digest.
167
+ 2. **Pixel shuffle** — A NumPy PCG64 RNG seeded from the digest shuffles all
168
+ pixel coordinates into a pseudo-random order.
169
+ 3. **Preamble** — The first `⌈13/channels⌉` shuffled pixels store a 13-bit
170
+ header at LSB-only depth: 5 bits for the embedding depth, 8 bits for the
171
+ filename-field length.
172
+ 4. **Main data** — Remaining shuffled pixels carry the payload at the chosen
173
+ depth using the formula `bit k → pixel = k ÷ (channels × depth)`,
174
+ `channel = (k ÷ depth) mod channels`, `bit_pos = k mod depth`.
175
+ 5. **Verification** — The first 512 bits of the main region store the
176
+ password-derived SHA-256 hash as ASCII. Extraction fails with E06 if it
177
+ does not match.
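
The bit-addressing rule in steps 3 and 4 can be sketched in a few lines. This is a minimal illustration, not the package's own code: `preamble_pixels` and `locate_bit` are hypothetical names, and the arithmetic is taken directly from the formulas above. The pixel value returned is an offset into the shuffled pixel list, counted from the first pixel after the preamble.

```python
from math import ceil

PREAMBLE_BITS = 13  # 5 bits of depth + 8 bits of filename-field length


def preamble_pixels(channels: int) -> int:
    """Shuffled pixels reserved for the preamble (one LSB per channel)."""
    return ceil(PREAMBLE_BITS / channels)


def locate_bit(k: int, channels: int, depth: int) -> tuple:
    """Map main-region bit index k to (pixel offset, channel, bit position)."""
    pixel = k // (channels * depth)
    channel = (k // depth) % channels
    bit_pos = k % depth
    return pixel, channel, bit_pos


# RGB carrier at depth 2: each pixel holds 3 channels x 2 bits = 6 payload bits,
# so bit 6 is the first bit that lands in the next pixel.
positions = [locate_bit(k, channels=3, depth=2) for k in range(7)]
```

Under these formulas the bits of one channel are filled from the LSB upward before the next channel is touched, and a single-channel image needs 13 preamble pixels while an RGBA image needs 4.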
@@ -0,0 +1,11 @@
1
+ main.py,sha256=yKZTyyUsuWmn9OklwSqJf3P2TnaiYUlGiCWTYnD_2-4,3578
2
+ util/__init__.py,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
3
+ util/container.py,sha256=2qB1GkbmkJIA_Wtt8hFHmdiPhDVru5j3RPIeRR30rn4,7626
4
+ util/errors.py,sha256=8hqXVmJHmGeRqEBndb-bN5doP6fxXjkqafcgoO3offE,1430
5
+ util/security.py,sha256=Yuo7U4393_hVEZy_Gl3wr2N7y9PmhQNGiy461xCqNT4,2140
6
+ util/utils.py,sha256=_Ja2uSxu5F3nqU5k8Zo28vWe2sj3gNcEOLID4HdGGXc,12665
7
+ lsb_tool-2.0.0.dist-info/METADATA,sha256=QlrgBmAeO7i5d6V8HTJ8At1kBC0SJPK4Hr0-uZbzeI8,5511
8
+ lsb_tool-2.0.0.dist-info/WHEEL,sha256=aeYiig01lYGDzBgS8HxWXOg3uV61G9ijOsup-k9o1sk,91
9
+ lsb_tool-2.0.0.dist-info/entry_points.txt,sha256=hCr-1TrTMg-pknLfhqjzhF4D31O4gpW8xuHam3GaoIQ,39
10
+ lsb_tool-2.0.0.dist-info/top_level.txt,sha256=bCJ8VhbgyTOrXkuBR6DxWcF940BOuPvpaBhclPUU-5g,10
11
+ lsb_tool-2.0.0.dist-info/RECORD,,
@@ -0,0 +1,5 @@
1
+ Wheel-Version: 1.0
2
+ Generator: setuptools (82.0.1)
3
+ Root-Is-Purelib: true
4
+ Tag: py3-none-any
5
+
@@ -0,0 +1,2 @@
1
+ [console_scripts]
2
+ lsb-tool = main:main
@@ -0,0 +1,2 @@
1
+ main
2
+ util
main.py ADDED
@@ -0,0 +1,105 @@
1
+ #! /usr/bin/env python3
2
+ """LSB Steganography Tool – CLI entry point.
3
+
4
+ Embeds files inside a PNG image or extracts previously embedded files from one,
5
+ using Least Significant Bit (LSB) steganography secured by a password.
6
+
7
+ Usage:
8
+ Embed: main.py -E -i <image> -p <password> -f <file1> [file2 ...] [-l <level>] [-n <len>] [-v]
9
+ Extract: main.py -e -i <image> -p <password> [-v]
10
+
11
+ Arguments:
12
+ -i Source image (any PIL-supported format; output is always PNG).
13
+ -p Password used to derive the shuffle seed and header hash.
14
+ -f One or more files to embed (embed mode only).
15
+ -l LSB depth level 1–N (default 1; N depends on image type — see ERRORS.md).
16
+ -n Filename field length in bytes 0–255 (default 0; 0 = no filenames stored).
17
+ -e Extract mode: read and save files hidden in the image.
18
+ -E Embed mode: hide files inside the image.
19
+ -v Verbose: print capacity and hash diagnostics after the operation.
20
+
21
+ Output:
22
+ Embed: <original_name>_embedded.png written to the current directory.
23
+ Extract: Files written to the current directory, named by stored filename or
24
+ extracted_file_0, extracted_file_1, … if no filename was stored.
25
+
26
+ Error codes are documented in ERRORS.md.
27
+ """
28
+
29
+ import sys
30
+ from argparse import ArgumentParser
31
+ from pathlib import Path
32
+
33
+ from util.container import Container
34
+ from util.errors import die, E
35
+ from util.utils import status
36
+
37
+
38
+ def _build_parser():
39
+ parser = ArgumentParser(
40
+ description="Embed or extract files hidden inside a PNG image using LSB steganography."
41
+ )
42
+ parser.add_argument("-i", type=Path, help="Image to embed files in / extract from",
43
+ required=True)
44
+ parser.add_argument("-p", type=str, help="Embedding password", required=True)
45
+ parser.add_argument("-f", type=Path, help="File(s) to embed (embed mode only)", nargs="+")
46
+ parser.add_argument("-l", type=int,
47
+ help="Embedding depth in bits per channel (default 1)", default=1)
48
+ parser.add_argument("-n", "--max-name-len", type=int,
49
+ help="Filename field length in bytes 0–255 (default 0 = no names)",
50
+ default=0)
51
+ parser.add_argument("-e", help="Extract files hidden in the image", action="store_true")
52
+ parser.add_argument("-E", help="Embed files into the image", action="store_true")
53
+ parser.add_argument("-v", help="Print capacity and hash diagnostics", action="store_true")
54
+ return parser
55
+
56
+
57
+ def main():
58
+ parser = _build_parser()
59
+ args = parser.parse_args()
60
+
61
+ if not args.i.exists():
62
+ die(E.IMAGE_NOT_FOUND, f"Image file '{args.i}' not found.")
63
+
64
+ if args.E: # Embedding
65
+ missing = [f for f in args.f if not f.exists()]
66
+ if missing:
67
+ for f in missing:
68
+ print(f"[E{E.EMBED_FILE_NOT_FOUND:02d}] File to embed not found: '{f}'",
69
+ file=sys.stderr)
70
+ sys.exit(E.EMBED_FILE_NOT_FOUND)
71
+
72
+ image = Container(args.i, args.p)
73
+ image.set_level(args.l)
74
+ image.set_max_name_len(args.max_name_len)
75
+
76
+ for f in args.f:
77
+ image.add_file(f)
78
+
79
+ print(f"Embedding {len(args.f)} file(s) into '{args.i}' at depth {image.depth}...",
80
+ flush=True)
81
+ image.embed()
82
+
83
+ out_name = "".join(str(args.i).split('/')[-1].split('.')[:-1]) + "_embedded.png"
84
+ print(f"Saving '{out_name}'...", flush=True)
85
+ image.save()
86
+ print("Done.")
87
+
88
+ if args.v:
89
+ status(image)
90
+
91
+ elif args.e: # Extraction
92
+ image = Container(args.i, args.p)
93
+
94
+ print(f"Extracting from '{args.i}'...", flush=True)
95
+ image.extract()
96
+
97
+ if args.v:
98
+ status(image)
99
+
100
+ else:
101
+ die(E.INVALID_ARGS, "No mode selected. Use -E to embed files or -e to extract.")
102
+
103
+
104
+ if __name__ == "__main__":
105
+ main()
util/__init__.py ADDED
File without changes
util/container.py ADDED
@@ -0,0 +1,197 @@
1
+ from util.security import get_hash, get_generator, verify_hash
2
+ from util.utils import embed_files, test_fit, extract_hash, extract_files, read_preamble
3
+ from util.errors import die, E
4
+ from PIL import Image, UnidentifiedImageError
5
+ from math import prod, ceil
6
+ from pathlib import Path
7
+
8
+ class Container:
9
+ """Wraps a PIL image and provides LSB steganography embed/extract operations.
10
+
11
+ On construction the image is opened, its pixels are loaded into a mutable
12
+ pixel-access object, and a password-derived shuffle of all pixel coordinates
13
+ is computed. Every subsequent read or write uses that shuffled order so that
14
+ data is scattered pseudo-randomly across the image rather than written
15
+ sequentially from the top-left corner.
16
+
17
+ Layout (in the shuffled pixel stream):
18
+
19
+ Preamble (first ceil(13/channels) pixels, always LSB/depth=1):
20
+ bits 0-4 : depth (5-bit uint, so 1–31; max valid value depends on image mode)
21
+ bits 5-12 : max_name_len (8-bit uint)
22
+
23
+ Main data (remaining pixels, depth=N bits per channel):
24
+ [0 – 511] : SHA-256 hash of the password (64 ASCII chars × 8 bits)
25
+ [512 – 543] : number of embedded files (32-bit uint)
26
+ per file:
27
+ 32 bits : file size in bytes
28
+ max_name_len*8 bits : filename, UTF-8, zero-padded (omitted if max_name_len=0)
29
+ [remainder] : raw file bytes, concatenated
30
+ """
31
+
32
+ supported_modes = ['1', 'I', 'I;16', 'L', 'LA', 'P', 'RGB', 'RGBA']
33
+
34
+ # Number of channels per pixel (= max LSBs at depth=1).
35
+ bits = { '1':1, 'I':1, 'I;16':1, 'L':1, 'P':1, #grayscale images
36
+ 'LA':2, #grayscale with alpha channel
37
+ 'RGB':3, #true color
38
+ 'RGBA':4 #true color with alpha channel
39
+ }
40
+
41
+ # Bit-depth of each channel value (stored here for reference / future use).
42
+ depths = { '1':1, #1-bit pixel values
43
+ 'L':8, 'P':8, 'LA':8, 'RGB':8, 'RGBA':8, #8-bit pixel values
44
+ 'I;16':16, #16-bit pixel values
45
+ 'I':32 #32-bit pixel values
46
+ }
47
+
48
+ def __init__(self, filename, password):
49
+ """Open an image file and prepare it for steganographic operations.
50
+
51
+ If the image mode is not in supported_modes it is converted to RGBA,
52
+ which is the most capable supported mode (4 bits/pixel).
53
+
54
+ The pixel coordinate list is shuffled using an RNG seeded from the
55
+ password hash so that the storage positions are unpredictable without
56
+ the password.
57
+
58
+ Args:
59
+ filename (str | Path): Path to the source image file.
60
+ password (str): Password used to derive the hash and shuffle seed.
61
+ """
62
+ #Image elements
63
+ self.filename = str(filename)
64
+ try:
65
+ self.img = Image.open(filename)
66
+ except (UnidentifiedImageError, OSError) as exc:
67
+ die(E.IMAGE_LOAD_ERROR,
68
+ f"Could not open '{filename}' as an image. "
69
+ f"The file may be corrupt, truncated, or in an unsupported format.")
70
+
71
+ #Checking if the image is supported as it is, converting otherwise
72
+ if self.img.mode not in self.supported_modes:
73
+ self.img = self.img.convert("RGBA") #Best type for embedding
74
+
75
+ self.size = self.img.size # (width, height) tuple
76
+ self.pixels = self.img.load() # Mutable PixelAccess object
77
+
78
+ #Security elements
79
+ self.hash = get_hash(password, self)
80
+ self.rng = get_generator(self.hash)
81
+ # Build a flat list of (x, y) coordinates for every pixel, then shuffle
82
+ # it so data is written in a password-dependent pseudo-random order.
83
+ self.pixel_values = [(i//self.size[1], i%self.size[1]) for i in range(prod(self.size))]
84
+ self.rng.shuffle(self.pixel_values)
85
+
86
+ #Key elements
87
+ self.files = [] # List of bytes objects, one per file to embed/extract
88
+ self.filenames = [] # Original filenames, one per file
89
+ self.depth = 1 # LSB embedding depth (1 up to the channel bit-depth; see set_level)
90
+ self.max_name_len = 0 # Filename field length in bytes (0 = no filename stored)
91
+
92
+ def add_file(self, filename):
93
+ """Read a file from disk and queue it for embedding.
94
+
95
+ Args:
96
+ filename (str | Path): Path to the file to embed.
97
+ """
98
+ with open(filename, "rb") as f:
99
+ self.files.append(f.read())
100
+ self.filenames.append(Path(filename).name)
101
+
102
+ def set_level(self, level):
103
+ """Set the LSB embedding depth level.
104
+
105
+ The maximum allowed depth is the bit-depth of the image's channel type:
106
+ 1-bit ('1'): depth capped at 1
107
+ 8-bit ('L','P','LA',etc.): depth capped at 8
108
+ 16-bit ('I;16'): depth capped at 16
109
+ 32-bit ('I'): depth capped at 32
110
+
111
+ Deeper levels store more bits per channel, increasing capacity at the
112
+ cost of more visible image degradation.
113
+
114
+ Args:
115
+ level (int): Desired bit depth (clamped to 1 – channel bit-depth).
116
+ """
117
+ max_depth = self.depths.get(self.img.mode, 8)
118
+ clamped = max(1, min(max_depth, level))
119
+ if clamped != level:
120
+ print(f"Warning: depth {level} is not supported by this image type "
121
+ f"(max {max_depth} for '{self.img.mode}' mode). Using depth={clamped}.")
122
+ self.depth = clamped
123
+
124
+ def set_max_name_len(self, n):
125
+ """Set the per-file filename field length in bytes (0–255).
126
+
127
+ Set to 0 (the default) to omit filenames entirely; extracted files
128
+ will be named extracted_file_0, extracted_file_1, etc.
129
+
130
+ Args:
131
+ n (int): Filename field size in bytes (clamped to 0–255).
132
+ """
133
+ self.max_name_len = max(0, min(255, n))
134
+
135
+ def embed(self):
136
+ """Write all queued files into the image using LSB steganography.
137
+
138
+ Verifies the payload fits before writing; exits with E05 if not.
139
+ """
140
+ channels = self.bits[self.img.mode]
141
+ test_fit(self, channels, self.depth) # exits via die() if too large
142
+ embed_files(self, channels, self.depth)
143
+
144
+ def extract(self):
145
+ """Extract previously embedded files from the image and write them to disk.
146
+
147
+ Reads the preamble to determine the embedding depth and filename field
148
+ length, then verifies the header hash. If the hash matches, reads all
149
+ embedded files and writes them to the current directory using either the
150
+ stored filename (if max_name_len > 0) or extracted_file_<index>.
151
+ """
152
+ channels = self.bits[self.img.mode]
153
+ preamble_pixels = ceil(13 / channels)
154
+
155
+ depth, max_name_len = read_preamble(self)
156
+ self.depth = depth
157
+ self.max_name_len = max_name_len
158
+
159
+ hash_string = extract_hash(self, channels, depth, preamble_pixels)
160
+
161
+ if verify_hash(self, hash_string):
162
+ extract_files(self, channels, depth, preamble_pixels, max_name_len)
163
+ written = []
164
+ for i, (file_data, filename) in enumerate(zip(self.files, self.filenames)):
165
+ out_name = filename if filename else f"extracted_file_{i}"
166
+ try:
167
+ with open(out_name, "wb") as f:
168
+ f.write(file_data)
169
+ except OSError:
170
+ die(E.WRITE_ERROR,
171
+ f"Could not write '{out_name}'. "
172
+ f"Check that you have write permission in the current directory.")
173
+ written.append(out_name)
174
+ print(f"Extracted {len(written)} file(s): {written}")
175
+ else:
176
+ die(E.WRONG_PASSWORD,
177
+ "Could not verify the embedded data. "
178
+ "The password may be incorrect, or this image contains no hidden files.")
179
+
180
+ def save(self):
181
+ """Save the modified image as a PNG file in the current directory.
182
+
183
+ The output filename is derived from the original filename by stripping
184
+ the extension and appending ``_embedded.png``.
185
+ PNG is used to guarantee lossless storage of the modified pixel values;
186
+ a lossy format (e.g. JPEG) would destroy the hidden data.
187
+ """
188
+ new_file = "".join(self.filename.split('/')[-1].split('.')[:-1])
189
+ self.img.save(new_file+"_embedded.png", format="png")
190
+
191
+ def reset_rng(self):
192
+ """Re-seed the RNG to its initial state using the stored hash.
193
+
194
+ Useful if you need to replay the pixel shuffle from the beginning
195
+ (e.g. for a second pass over the image).
196
+ """
197
+ self.rng = get_generator(self.hash)
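
The layout in the `Container` docstring fixes how many bits a payload needs, which is what decides whether an embed can succeed. The sketch below reproduces that arithmetic under the stated layout; `message_bits` and `main_capacity_bits` are hypothetical helper names, not functions from the package.

```python
from math import ceil


def message_bits(file_sizes, max_name_len=0):
    """Bits needed in the main region for files of the given byte sizes."""
    header = 512 + 32                 # ASCII hash + 32-bit file count
    per_file = 32 + 8 * max_name_len  # 32-bit size + optional filename field
    return header + per_file * len(file_sizes) + 8 * sum(file_sizes)


def main_capacity_bits(width, height, channels, depth):
    """Bits available after the 13-bit preamble pixels are set aside."""
    return (width * height - ceil(13 / channels)) * channels * depth


# Two files (1000 and 250 bytes) with 32-byte filename fields,
# embedded into a 100x100 RGB image at depth 1.
needed = message_bits([1000, 250], max_name_len=32)
available = main_capacity_bits(100, 100, channels=3, depth=1)
fits = needed <= available
```

In this example the payload needs 11,120 bits against 29,985 available, so it fits with room to spare; doubling the depth doubles the available figure.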
util/errors.py ADDED
@@ -0,0 +1,35 @@
1
+ """Exit codes and error-reporting for the LSB steganography tool.
2
+
3
+ All user-visible errors are routed through ``die()``, which prints a
4
+ human-readable message prefixed with the error code and exits with that
5
+ code as the process exit status. Codes are documented in full in
6
+ ERRORS.md at the project root.
7
+ """
8
+
9
+ import sys
10
+
11
+
12
+ class E:
13
+ """Exit-code constants. See ERRORS.md for the authoritative reference."""
14
+ INVALID_ARGS = 2 # Bad or missing CLI arguments (shared with argparse)
15
+ IMAGE_NOT_FOUND = 3 # Source image file does not exist on disk
16
+ EMBED_FILE_NOT_FOUND = 4 # A file queued for embedding does not exist
17
+ IMAGE_TOO_SMALL = 5 # Payload does not fit at the chosen depth
18
+ WRONG_PASSWORD = 6 # Hash mismatch during extraction
19
+ IMAGE_LOAD_ERROR = 7 # PIL cannot open / decode the image
20
+ WRITE_ERROR = 8 # Cannot write an output file
21
+ INTERNAL_ERROR = 10 # Should never be reached; indicates a bug
22
+
23
+
24
+ def die(code, message):
25
+ """Print a user-friendly error and exit with ``code``.
26
+
27
+ Output is written to stderr so it does not pollute piped stdout.
28
+ The ``[Exx]`` prefix lets users cross-reference ERRORS.md quickly.
29
+
30
+ Args:
31
+ code (int): One of the ``E.*`` constants.
32
+ message (str): Plain-language description of what went wrong.
33
+ """
34
+ print(f"[E{code:02d}] {message}", file=sys.stderr)
35
+ sys.exit(code)
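
Because every failure maps to a distinct exit status, a wrapper script can branch on the code instead of parsing stderr. The lookup below is a minimal sketch built from the `E` constants above; `EXIT_MEANINGS` and `describe_exit` are hypothetical names, and treating status 0 as success is the usual process convention rather than something the module states.

```python
# Exit statuses copied from the E class; names and wording are illustrative.
EXIT_MEANINGS = {
    2: "Bad or missing CLI arguments",
    3: "Source image file not found",
    4: "A file to embed was not found",
    5: "Payload does not fit at the chosen depth",
    6: "Wrong password or no embedded data",
    7: "Image could not be opened or decoded",
    8: "Cannot write an output file",
    10: "Internal error (bug)",
}


def describe_exit(code: int) -> str:
    """Return a short description for an lsb-tool exit status."""
    if code == 0:
        return "Success"
    return EXIT_MEANINGS.get(code, f"Unknown exit status {code}")


summary = describe_exit(6)
```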
util/security.py ADDED
@@ -0,0 +1,61 @@
1
+ import hmac
2
+ from hashlib import sha256
3
+ from numpy.random import default_rng
4
+ from math import prod
5
+
6
+ def get_hash(password, img):
7
+ """Derive a deterministic SHA-256 hash from a password and image dimensions.
8
+
9
+ The password is salted with the image size string, then hashed repeatedly
10
+ width * height times (prod(img.size)) to make brute-force attacks expensive. A final
11
+ SHA-256 pass produces the output hex digest.
12
+
13
+ Args:
14
+ password (str): The user-supplied embedding/extraction password.
15
+ img: A Container instance (must have a .size attribute of (width, height)).
16
+
17
+ Returns:
18
+ str: A 64-character lowercase hex SHA-256 digest used as both the
19
+ header verification token and the RNG seed.
20
+ """
21
+ password = password+str(img.size)
22
+ rounds = prod(img.size)
23
+ for i in range(rounds):
24
+ password = sha256(password.encode()).hexdigest()
25
+ return sha256(password.encode()).hexdigest()
26
+
27
+ def get_generator(seed):
28
+ """Create a NumPy random Generator seeded from a hex hash string.
29
+
30
+ The hex digest is converted to an integer so it can serve as a numeric seed
31
+ for numpy's default_rng (PCG64). The same seed always produces the same
32
+ pixel shuffle order, which is required for both embedding and extraction.
33
+
34
+ Args:
35
+ seed (str): A hex string (e.g. the output of get_hash).
36
+
37
+ Returns:
38
+ numpy.random.Generator: A seeded RNG used to shuffle pixel coordinates.
39
+ """
40
+ return default_rng(seed=int(seed, 16))
41
+
42
+ def verify_hash(container, hsh):
43
+ """Check whether an extracted header hash matches the container's expected hash.
44
+
45
+ Called during extraction to confirm that the correct password was supplied
46
+ and that the image actually contains embedded data.
47
+
48
+ Args:
49
+ container: A Container instance with a .hash attribute.
50
+ hsh (str): The hash string read back from the image header.
51
+
52
+ Returns:
53
+ bool: True if the hashes match (correct password / data present).
54
+ """
55
+ try:
56
+ return hmac.compare_digest(container.hash, hsh)
57
+ except TypeError:
58
+ # hsh may contain non-ASCII characters when the password is wrong and
59
+ # the extracted bit-stream decodes to arbitrary Unicode code points.
60
+ # Any such mismatch is definitively a failed verification.
61
+ return False
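
The derivation in `get_hash` and the seed conversion in `get_generator` can be tried in isolation with only the standard library. This is a sketch under the assumption that the size is passed as a plain `(width, height)` tuple; `derive_digest` is a hypothetical name, and the NumPy generator step is left out so the example runs without third-party packages.

```python
from hashlib import sha256
from math import prod


def derive_digest(password: str, size: tuple) -> str:
    """Salt the password with the image size, then iterate SHA-256."""
    salted = password + str(size)     # e.g. "hunter2(4, 3)"
    for _ in range(prod(size)):       # one round per pixel
        salted = sha256(salted.encode()).hexdigest()
    return sha256(salted.encode()).hexdigest()  # final pass


digest = derive_digest("hunter2", (4, 3))
seed = int(digest, 16)  # integer form that would be handed to the RNG
```

The round count equals the pixel count, so the cost of deriving the key grows with the carrier: a 4000 × 3000 image implies 12 million hash rounds on every embed and extract. The same password also yields a different digest for a 3 × 4 image than for a 4 × 3 one, because the size string is part of the salt.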
util/utils.py ADDED
@@ -0,0 +1,370 @@
1
+ from math import prod, ceil
2
+ from pathlib import Path
3
+ from util.errors import die, E
4
+
5
+ # Header size lookup table (kept for future reference / validation use).
6
+ # Each entry is [hash_bits, file_count_bits, ???, total_bits] for a given mode.
7
+ # Not actively used in the current implementation; computed values in
8
+ # embed_files / test_fit are the authoritative source.
9
+ header_size = {
10
+ **dict.fromkeys(['1', 'I', 'I;16', 'L', 'P'], [504, 32, 32, 568]), #For grayscale and black/white images (1 channel)
11
+ **dict.fromkeys(['LA'], [504, 32, 33, 569]), #For grayscale with alpha channel (2 channels)
12
+ **dict.fromkeys(['RGB', 'RGBA'], [504, 32, 34, 570]) #For true color (3 and 4 channels)
13
+ }
14
+
15
+
16
+
17
+ def inject_pixel(orig, adder, bit_pos=0):
18
+ """Write one message bit into each channel of a single pixel at the given bit position.
19
+
20
+ For each channel: if the message bit is 1, bit ``bit_pos`` is set; if 0 it is
21
+ cleared. ``bit_pos=0`` (the default) modifies the LSB, reproducing the
22
+ original behaviour.
23
+
24
+ When ``adder`` has fewer elements than ``orig`` (e.g. embedding 3-bit data
25
+ into an RGBA pixel), the missing channels are copied from the original so
26
+ those channels are left untouched.
27
+
28
+ Args:
29
+ orig (tuple | int): The original pixel value. A tuple for multi-channel
30
+ modes (RGB, RGBA, LA), or a plain int for grayscale.
31
+ adder (list[int]): Message bits to write, one per channel (0 or 1).
32
+ bit_pos (int): Which bit of each channel to modify (0 = LSB).
33
+
34
+ Returns:
35
+ tuple: The modified pixel value with the target bit updated.
36
+ """
37
+ adder = list(adder) # prevent in-place mutation of caller's list
38
+ mask = 1 << bit_pos
39
+ inject = lambda x: (x[0] | mask) if x[1] else (x[0] & ~mask)
40
+
41
+ try:
42
+ final = list(orig)
43
+ except TypeError: # single-channel pixels are plain ints, not tuples
44
+ final = [orig]
45
+
46
+ if len(final) == len(adder):
47
+ final = list(map(inject, [(final[i], adder[i]) for i in range(len(adder))]))
48
+ return tuple(final)
49
+ elif len(adder) < len(final):
50
+ # Fewer message bits than channels: re-write each extra channel's current bit so it stays unchanged
51
+ for i in range(len(adder), len(final)):
52
+ adder.append((final[i] >> bit_pos) & 1)
53
+ final = list(map(inject, [(final[i], adder[i]) for i in range(len(adder))]))
54
+ return tuple(final)
55
+ else:
56
+ # Should never happen: more message bits than pixel channels
57
+ die(E.INTERNAL_ERROR,
58
+ "inject_pixel received more bits than the pixel has channels. "
59
+ "Please report this as a bug.")
60
+
61
+
62
+
63
+ def extract_pixel(orig, channel, bit_pos=0):
64
+ """Read one bit from a specific channel of a pixel.
65
+
66
+ Args:
67
+ orig (tuple | int): Pixel value (tuple for multi-channel, int for grayscale).
68
+ channel (int): Zero-based channel index to read from.
69
+ bit_pos (int): Which bit to read (0 = LSB).
70
+
71
+ Returns:
72
+ int: The requested bit (0 or 1).
73
+ """
74
+ try:
75
+ orig = list(orig)
76
+ except TypeError: # single-channel pixels are plain ints, not tuples
77
+ orig = [orig]
78
+
79
+ return (orig[channel] >> bit_pos) & 1
80
+
81
+
82
+
83
+ # Split a flat list/string into 8-element chunks (used to reassemble bytes from bits).
84
+ splitup = lambda arr: [arr[i:i+8] for i in range(0, len(arr), 8)]
85
+
86
+
87
+
88
+ def _bits_to_str(bits):
89
+ return "".join([str(b) for b in bits])
90
+
91
+
92
+ def _write_preamble_bit(container, k, bit, channels):
93
+ """Write a single bit to position k in the preamble region (always LSB, depth=1)."""
94
+ pixel_idx = k // channels
95
+ ch = k % channels
96
+ x, y = container.pixel_values[pixel_idx]
97
+ try:
98
+ p = list(container.pixels[x, y])
99
+ except TypeError: # single-channel pixels are plain ints, not tuples
100
+ p = [container.pixels[x, y]]
101
+ if bit:
102
+ p[ch] |= 1
103
+ else:
104
+ p[ch] &= ~1
105
+ # Mode "1": PIL normalises any non-zero value to 255 on storage.
106
+ # 255 & ~1 = 254 (non-zero) would be stored as 255, flipping a 0-bit back to 1.
107
+ # Explicitly write 0 or 255 so the LSB reads back correctly.
108
+ if container.img.mode == '1':
109
+ p[ch] = 255 if (p[ch] & 1) else 0
110
+ container.pixels[x, y] = tuple(p)
111
+
112
+
113
+ def _read_preamble_bit(container, k, channels):
114
+ """Read a single bit from position k in the preamble region (always LSB, depth=1)."""
115
+ pixel_idx = k // channels
116
+ ch = k % channels
117
+ x, y = container.pixel_values[pixel_idx]
118
+ return extract_pixel(container.pixels[x, y], ch, 0)
119
+
120
+
121
+ def _write_main_bit(container, k, bit, channels, depth, preamble_pixels):
122
+ """Write a single bit to position k in the main data region."""
123
+ pixel_idx = k // (channels * depth)
124
+ ch = (k // depth) % channels
125
+ bp = k % depth
126
+ x, y = container.pixel_values[preamble_pixels + pixel_idx]
127
+ mask = 1 << bp
128
+ try:
129
+ p = list(container.pixels[x, y])
130
+ except TypeError: # single-channel pixels are plain ints, not tuples
131
+ p = [container.pixels[x, y]]
132
+ if bit:
133
+ p[ch] |= mask
134
+ else:
135
+ p[ch] &= ~mask
136
+ # Mode "1": same normalisation as _write_preamble_bit — avoid 254→255 rounding.
137
+ if container.img.mode == '1':
138
+ p[ch] = 255 if (p[ch] & mask) else 0
139
+ container.pixels[x, y] = tuple(p)
140
+
141
+
142
+ def _read_main_bit(container, k, channels, depth, preamble_pixels):
143
+ """Read a single bit from position k in the main data region."""
144
+ pixel_idx = k // (channels * depth)
145
+ ch = (k // depth) % channels
146
+ bp = k % depth
147
+ x, y = container.pixel_values[preamble_pixels + pixel_idx]
148
+ return extract_pixel(container.pixels[x, y], ch, bp)
149
+
150
+
151
+ def read_preamble(container):
152
+ """Read the 13-bit preamble and return (depth, max_name_len).
153
+
154
+ The preamble is always stored at LSB (bit_pos=0), using the first
155
+ ceil(13 / channels) pixels in the shuffled order.
156
+
157
+ bits 0-4 : depth (5-bit uint, values 1-31)
158
+ bits 5-12 : max_name_len (8-bit uint, values 0-255)
159
+
160
+ Args:
161
+ container: A Container instance.
162
+
163
+ Returns:
164
+ tuple[int, int]: (depth, max_name_len)
165
+ """
166
+ channels = container.bits[container.img.mode]
167
+ bits = [_read_preamble_bit(container, k, channels) for k in range(13)]
168
+ depth = int(_bits_to_str(bits[0:5]), 2)
169
+ max_name_len = int(_bits_to_str(bits[5:13]), 2)
170
+ # Clamp depth to a valid range so a bad password doesn't cause division by zero
171
+ depth = max(1, min(32, depth))
172
+ return depth, max_name_len
173
+
174
+
175
+
176
+ def embed_files(container, channels, depth):
177
+ """Encode the preamble, header, filenames, and file data into the container's pixels.
178
+
179
+ Layout:
180
+ Preamble (first ceil(13/channels) pixels, depth=1):
181
+ 5 bits: depth
182
+ 8 bits: max_name_len
183
+
184
+ Main data region (remaining pixels, depth=N):
185
+ 512 bits : hash (64 ASCII chars)
186
+ 32 bits : num_files
187
+ per file:
188
+ 32 bits : file size in bytes
189
+ max_name_len*8 bits : filename bytes, UTF-8, zero-padded (omitted if max_name_len=0)
190
+ file data (concatenated)
191
+
192
+ Args:
193
+ container: A Container instance with .hash, .files, .filenames,
194
+ .max_name_len, and .pixel_values attributes.
195
+ channels (int): Number of channels in the image (bits per pixel at depth=1).
196
+ depth (int): Number of LSBs per channel to use (1 up to the channel bit-depth).
197
+ """
198
+ preamble_pixels = ceil(13 / channels)
199
+
200
+ # Write 13-bit preamble (always depth=1, bit_pos=0)
201
+ preamble_bits = f"{depth:05b}{container.max_name_len:08b}"
202
+ for k in range(13):
203
+ _write_preamble_bit(container, k, int(preamble_bits[k]), channels)
204
+
205
+ # Build main message: hash + num_files + per-file (size [+ name]) + file data
206
+ header = "".join([f"{ord(c):08b}" for c in container.hash]) # 512 bits
207
+ header += f"{len(container.files):032b}"
208
+
209
+ for i, file_data in enumerate(container.files):
210
+ header += f"{len(file_data):032b}"
211
+ if container.max_name_len > 0:
212
+ name = container.filenames[i] if i < len(container.filenames) else ""
213
+ name_bytes = name.encode('utf-8')[:container.max_name_len]
214
+ name_bytes = name_bytes + b'\x00' * (container.max_name_len - len(name_bytes))
215
+ header += "".join([f"{b:08b}" for b in name_bytes])
216
+
217
+ files_bits = "".join(
218
+ ["".join([f"{b:08b}" for b in file_data]) for file_data in container.files]
219
+ )
220
+ message = header + files_bits
221
+
222
+ for k in range(len(message)):
223
+ _write_main_bit(container, k, int(message[k]), channels, depth, preamble_pixels)
224
+
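Note: the main-region layout documented in `embed_files` can be reproduced without an image, which makes the bit offsets easy to check. The sketch below follows the docstring layout under stated assumptions: `build_message` is a hypothetical standalone helper, and it takes plain values instead of a Container instance.

```python
from typing import List


def build_message(hash_hex: str, files: List[bytes],
                  filenames: List[str], max_name_len: int) -> str:
    """Return the main-region bit string: hash, file count, per-file fields, file data."""
    # 64 ASCII characters -> 512 bits
    bits = "".join(f"{ord(c):08b}" for c in hash_hex)
    # 32-bit file count
    bits += f"{len(files):032b}"
    for i, data in enumerate(files):
        # 32-bit size of this file in bytes
        bits += f"{len(data):032b}"
        if max_name_len > 0:
            name = filenames[i] if i < len(filenames) else ""
            # Truncate to the field width, then zero-pad to exactly max_name_len bytes.
            raw = name.encode("utf-8")[:max_name_len].ljust(max_name_len, b"\x00")
            bits += "".join(f"{b:08b}" for b in raw)
    # File contents follow all per-file fields, concatenated in order.
    bits += "".join(f"{b:08b}" for data in files for b in data)
    return bits


message = build_message("a" * 64, [b"ab", b"c"], ["a.txt", "b"], 4)
# First filename field starts after hash (512), count (32), and first size (32).
first_name_bits = message[576:608]
first_name = bytes(int(first_name_bits[j:j + 8], 2) for j in range(0, 32, 8))
```

With two files and a 4-byte name field, the message is 512 + 32 + 2 * (32 + 32) + 24 = 696 bits long, and the first name is cut to `a.tx`.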
225
+
226
+
227
+ def test_fit(container, channels, depth):
228
+ """Check whether the queued files will fit inside the image.
229
+
230
+ Args:
231
+ container: A Container instance.
232
+ channels (int): Number of channels in the image.
233
+ depth (int): Number of LSBs per channel to use.
234
+
235
+ Returns:
236
+ bool: True if the message fits. Otherwise die(E.IMAGE_TOO_SMALL, ...) is called rather than returning False.
237
+ """
238
+ preamble_pixels = ceil(13 / channels)
239
+ total_pixels = prod(container.size)
240
+ main_pixels = total_pixels - preamble_pixels
241
+ main_capacity = main_pixels * channels * depth
242
+
243
+ name_bits_per_file = container.max_name_len * 8
244
+ files_data_bits = sum([len(f) for f in container.files]) * 8
245
+ message_size = 512 + 32 + (32 + name_bits_per_file) * len(container.files) + files_data_bits
246
+
247
+ if message_size > main_capacity:
248
+ needed_bytes = (message_size + 7) // 8
249
+ available_bytes = main_capacity // 8
250
+ die(E.IMAGE_TOO_SMALL,
251
+ f"The files are too large to embed in this image at depth {depth}.\n"
252
+ f" Required : {needed_bytes:,} bytes\n"
253
+ f" Available: {available_bytes:,} bytes\n"
254
+ f" Try a larger image, increase the depth with -l, or embed fewer files.")
255
+ return True
256
+
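Note: the capacity check in `test_fit` is plain arithmetic and can be worked through by hand. A minimal sketch, assuming hypothetical helper names `capacity_bits` and `message_bits` and passing image dimensions directly instead of a Container:

```python
from math import ceil
from typing import List

PREAMBLE_BITS = 13            # 5 bits depth + 8 bits max_name_len
FIXED_HEADER_BITS = 512 + 32  # hash + num_files


def capacity_bits(width: int, height: int, channels: int, depth: int) -> int:
    """Bits available in the main data region after reserving the preamble pixels."""
    preamble_pixels = ceil(PREAMBLE_BITS / channels)
    return (width * height - preamble_pixels) * channels * depth


def message_bits(file_sizes: List[int], max_name_len: int) -> int:
    """Bits needed for the header plus all file data (file_sizes are in bytes)."""
    per_file_header = 32 + max_name_len * 8
    return FIXED_HEADER_BITS + per_file_header * len(file_sizes) + 8 * sum(file_sizes)


# 100x100 RGB carrier, one 4000-byte file, 16-byte filename field
needed = message_bits([4000], 16)
at_depth_1 = capacity_bits(100, 100, 3, 1)
at_depth_2 = capacity_bits(100, 100, 3, 2)
```

For this carrier the preamble reserves ceil(13 / 3) = 5 pixels, leaving 9,995 pixels. The file needs 32,704 bits, which exceeds the 29,985 bits available at depth 1 but fits in the 59,970 bits available at depth 2.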
257
+
258
+
259
+ def status(container):
260
+ """Print a diagnostic summary of the container's capacity and contents.
261
+
262
+ Args:
263
+ container: A Container instance (used after embed or extract).
264
+ """
265
+ channels = container.bits[container.img.mode]
266
+ depth = container.depth
267
+ preamble_pixels = ceil(13 / channels)
268
+ total_pixels = prod(container.size)
269
+ main_pixels = total_pixels - preamble_pixels
270
+ capacity = main_pixels * channels * depth
271
+
272
+ name_bits_per_file = container.max_name_len * 8
273
+ files_data_bits = sum([len(f) * 8 for f in container.files])
274
+ message_size = 512 + 32 + (32 + name_bits_per_file) * len(container.files) + files_data_bits
275
+
276
+ print(f"""
277
+ {container.filename}
278
+
279
+ Embedded file sizes: {[len(i) for i in container.files]}
280
+
281
+ ->{container.hash}
282
+ ->{container.rng}
283
+ ->{container.img.mode} mode
284
+ ->{container.size} pixels
285
+ ->{channels * depth} estimated bits/pixel (depth={depth})
286
+ ->{capacity} estimated capacity in bits
287
+ ->{message_size} estimated message size in bits
288
+ """)
289
+
290
+
291
+
292
+ def extract_hash(container, channels, depth, preamble_pixels):
293
+ """Read the 512-bit hash from the beginning of the embedded main data region.
294
+
295
+ Args:
296
+ container: A Container instance whose pixel_values have been shuffled.
297
+ channels (int): Number of channels in the image.
298
+ depth (int): Embedding depth read from the preamble.
299
+ preamble_pixels (int): Number of pixels reserved for the preamble.
300
+
301
+ Returns:
302
+ str: The 64-character hex hash string extracted from the image header.
303
+ """
304
+ raw_bits = [_read_main_bit(container, k, channels, depth, preamble_pixels)
305
+ for k in range(512)]
306
+ raw_str = _bits_to_str(raw_bits)
307
+ return "".join([chr(int(raw_str[j:j+8], 2)) for j in range(0, len(raw_str), 8)])
308
+
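Note: `extract_hash` reverses the `f"{ord(c):08b}"` encoding used when the header is built. The round trip can be sketched as follows. The use of SHA-256 here is an assumption made only because its hex digest is 64 ASCII characters; the diff does not show which hash the tool computes. `chars_to_bits` and `bits_to_chars` are hypothetical names.

```python
import hashlib


def chars_to_bits(text: str) -> str:
    """Encode each character as 8 bits, most significant bit first."""
    return "".join(f"{ord(c):08b}" for c in text)


def bits_to_chars(bits: str) -> str:
    """Decode a bit string back into characters, 8 bits at a time."""
    return "".join(chr(int(bits[j:j + 8], 2)) for j in range(0, len(bits), 8))


# Assumed digest: any 64-character ASCII string behaves the same way.
digest = hashlib.sha256(b"example password").hexdigest()
bits = chars_to_bits(digest)
recovered = bits_to_chars(bits)
```

Because every hex character is below code point 128, each 8-bit group starts with a 0 bit. A wrong password would typically decode to a string that fails comparison with the expected digest.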
309
+
310
+
311
+ def extract_files(container, channels, depth, preamble_pixels, max_name_len):
312
+ """Read the file count, sizes, filenames, and raw file data from the main data region.
313
+
314
+ Populates container.files and container.filenames.
315
+
316
+ Args:
317
+ container: A Container instance after a successful extract_hash call.
318
+ channels (int): Number of channels in the image.
319
+ depth (int): Embedding depth read from the preamble.
320
+ preamble_pixels (int): Number of pixels reserved for the preamble.
321
+ max_name_len (int): Filename field length in bytes (0 = no filename stored).
322
+ """
323
+ container.files = []
324
+ container.filenames = []
325
+
326
+ # Read num_files (bits 512-543)
327
+ bits = [_read_main_bit(container, k, channels, depth, preamble_pixels)
328
+ for k in range(512, 544)]
329
+ no_files = int(_bits_to_str(bits), 2)
330
+
331
+ offset = 544
332
+ file_sizes = []
333
+ file_names = []
334
+
335
+ for i in range(no_files):
336
+ # Read file size (32 bits)
337
+ size_bits = [_read_main_bit(container, k, channels, depth, preamble_pixels)
338
+ for k in range(offset, offset + 32)]
339
+ file_size = int(_bits_to_str(size_bits), 2)
340
+ file_sizes.append(file_size)
341
+ offset += 32
342
+
343
+ # Read filename field if present
344
+ if max_name_len > 0:
345
+ name_bits = [_read_main_bit(container, k, channels, depth, preamble_pixels)
346
+ for k in range(offset, offset + max_name_len * 8)]
347
+ name_bytes = bytes([int(_bits_to_str(name_bits[j:j+8]), 2)
348
+ for j in range(0, len(name_bits), 8)])
349
+ null = name_bytes.find(b'\x00')
350
+ name_bytes = name_bytes[:null] if null != -1 else name_bytes
351
+ try:
352
+ filename = Path(name_bytes.decode('utf-8')).name
353
+ if not filename:
354
+ filename = f"extracted_file_{i}"
355
+ except (UnicodeDecodeError, ValueError):
356
+ filename = f"extracted_file_{i}"
357
+ file_names.append(filename)
358
+ offset += max_name_len * 8
359
+ else:
360
+ file_names.append(f"extracted_file_{i}")
361
+
362
+ # Read file data
363
+ for i in range(no_files):
364
+ file_bits = [_read_main_bit(container, k, channels, depth, preamble_pixels)
365
+ for k in range(offset, offset + file_sizes[i] * 8)]
366
+ file_bytes = bytes([int(_bits_to_str(file_bits[j:j+8]), 2)
367
+ for j in range(0, len(file_bits), 8)])
368
+ container.files.append(bytearray(file_bytes))
369
+ container.filenames.append(file_names[i])
370
+ offset += file_sizes[i] * 8