Perception 0.8.2__tar.gz → 0.8.3__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (58)
  1. {perception-0.8.2 → perception-0.8.3}/PKG-INFO +13 -13
  2. {perception-0.8.2 → perception-0.8.3}/perception/benchmarking/common.py +9 -3
  3. {perception-0.8.2 → perception-0.8.3}/perception/benchmarking/image.py +7 -4
  4. {perception-0.8.2 → perception-0.8.3}/perception/benchmarking/video.py +1 -1
  5. {perception-0.8.2 → perception-0.8.3}/perception/hashers/__init__.py +7 -0
  6. {perception-0.8.2 → perception-0.8.3}/perception/hashers/image/opencv.py +3 -3
  7. {perception-0.8.2 → perception-0.8.3}/perception/hashers/tools.py +189 -111
  8. {perception-0.8.2 → perception-0.8.3}/perception/hashers/video/tmk.py +7 -2
  9. {perception-0.8.2 → perception-0.8.3}/perception/testing/__init__.py +5 -3
  10. perception-0.8.3/perception/testing/videos/extra_channel_attached_pic.mp4 +0 -0
  11. perception-0.8.3/perception/testing/videos/extra_channel_attached_pic_audio.mp4 +0 -0
  12. {perception-0.8.2 → perception-0.8.3}/pyproject.toml +27 -36
  13. {perception-0.8.2 → perception-0.8.3}/setup.py +12 -12
  14. {perception-0.8.2 → perception-0.8.3}/LICENSE +0 -0
  15. {perception-0.8.2 → perception-0.8.3}/README.md +0 -0
  16. {perception-0.8.2 → perception-0.8.3}/build.py +0 -0
  17. {perception-0.8.2 → perception-0.8.3}/perception/__init__.py +0 -0
  18. {perception-0.8.2 → perception-0.8.3}/perception/approximate_deduplication/__init__.py +0 -0
  19. {perception-0.8.2 → perception-0.8.3}/perception/approximate_deduplication/debug.py +0 -0
  20. {perception-0.8.2 → perception-0.8.3}/perception/approximate_deduplication/index.py +0 -0
  21. {perception-0.8.2 → perception-0.8.3}/perception/approximate_deduplication/serve.py +0 -0
  22. {perception-0.8.2 → perception-0.8.3}/perception/benchmarking/__init__.py +0 -0
  23. {perception-0.8.2 → perception-0.8.3}/perception/benchmarking/extensions.pyx +0 -0
  24. {perception-0.8.2 → perception-0.8.3}/perception/benchmarking/image_transforms.py +0 -0
  25. {perception-0.8.2 → perception-0.8.3}/perception/benchmarking/video_transforms.py +0 -0
  26. {perception-0.8.2 → perception-0.8.3}/perception/extensions.pyx +0 -0
  27. {perception-0.8.2 → perception-0.8.3}/perception/hashers/hasher.py +0 -0
  28. {perception-0.8.2 → perception-0.8.3}/perception/hashers/image/__init__.py +0 -0
  29. {perception-0.8.2 → perception-0.8.3}/perception/hashers/image/average.py +0 -0
  30. {perception-0.8.2 → perception-0.8.3}/perception/hashers/image/dhash.py +0 -0
  31. {perception-0.8.2 → perception-0.8.3}/perception/hashers/image/pdq.py +0 -0
  32. {perception-0.8.2 → perception-0.8.3}/perception/hashers/image/phash.py +0 -0
  33. {perception-0.8.2 → perception-0.8.3}/perception/hashers/image/wavelet.py +0 -0
  34. {perception-0.8.2 → perception-0.8.3}/perception/hashers/video/__init__.py +0 -0
  35. {perception-0.8.2 → perception-0.8.3}/perception/hashers/video/framewise.py +0 -0
  36. {perception-0.8.2 → perception-0.8.3}/perception/local_descriptor_deduplication.py +0 -0
  37. {perception-0.8.2 → perception-0.8.3}/perception/py.typed +0 -0
  38. {perception-0.8.2 → perception-0.8.3}/perception/testing/images/README.md +0 -0
  39. {perception-0.8.2 → perception-0.8.3}/perception/testing/images/image1.jpg +0 -0
  40. {perception-0.8.2 → perception-0.8.3}/perception/testing/images/image10.jpg +0 -0
  41. {perception-0.8.2 → perception-0.8.3}/perception/testing/images/image2.jpg +0 -0
  42. {perception-0.8.2 → perception-0.8.3}/perception/testing/images/image3.jpg +0 -0
  43. {perception-0.8.2 → perception-0.8.3}/perception/testing/images/image4.jpg +0 -0
  44. {perception-0.8.2 → perception-0.8.3}/perception/testing/images/image5.jpg +0 -0
  45. {perception-0.8.2 → perception-0.8.3}/perception/testing/images/image6.jpg +0 -0
  46. {perception-0.8.2 → perception-0.8.3}/perception/testing/images/image7.jpg +0 -0
  47. {perception-0.8.2 → perception-0.8.3}/perception/testing/images/image8.jpg +0 -0
  48. {perception-0.8.2 → perception-0.8.3}/perception/testing/images/image9.jpg +0 -0
  49. {perception-0.8.2 → perception-0.8.3}/perception/testing/logos/README.md +0 -0
  50. {perception-0.8.2 → perception-0.8.3}/perception/testing/logos/logoipsum.png +0 -0
  51. {perception-0.8.2 → perception-0.8.3}/perception/testing/videos/README.md +0 -0
  52. {perception-0.8.2 → perception-0.8.3}/perception/testing/videos/expected_tmk.json.gz +0 -0
  53. {perception-0.8.2 → perception-0.8.3}/perception/testing/videos/rgb.m4v +0 -0
  54. {perception-0.8.2 → perception-0.8.3}/perception/testing/videos/v1.m4v +0 -0
  55. {perception-0.8.2 → perception-0.8.3}/perception/testing/videos/v2.m4v +0 -0
  56. {perception-0.8.2 → perception-0.8.3}/perception/testing/videos/v2s.mov +0 -0
  57. {perception-0.8.2 → perception-0.8.3}/perception/tools.py +0 -0
  58. {perception-0.8.2 → perception-0.8.3}/perception/utils.py +0 -0
{perception-0.8.2 → perception-0.8.3}/PKG-INFO
@@ -1,13 +1,12 @@
  Metadata-Version: 2.4
  Name: Perception
- Version: 0.8.2
+ Version: 0.8.3
  Summary: Perception provides flexible, well-documented, and comprehensively tested tooling for perceptual hashing research, development, and production use.
- License: Apache-2.0
+ License-Expression: Apache-2.0
  License-File: LICENSE
  Author: Thorn
  Author-email: info@wearethorn.org
  Requires-Python: >=3.10,<4.0
- Classifier: License :: OSI Approved :: Apache Software License
  Classifier: Programming Language :: Python :: 3
  Classifier: Programming Language :: Python :: 3.10
  Classifier: Programming Language :: Python :: 3.11
@@ -17,26 +16,27 @@ Classifier: Programming Language :: Python :: 3.14
  Provides-Extra: benchmarking
  Provides-Extra: experimental
  Provides-Extra: matching
- Requires-Dist: Cython (>=3,<4)
+ Provides-Extra: pdq
+ Requires-Dist: Cython (>=3.0.0,<4.0.0)
  Requires-Dist: Pillow
  Requires-Dist: aiohttp ; extra == "matching"
- Requires-Dist: faiss-cpu (>=1.8.0.post1,<2.0.0) ; extra == "experimental"
+ Requires-Dist: albumentations (>=2.0.8,<3.0.0) ; extra == "benchmarking"
+ Requires-Dist: faiss-cpu (>=1.8.0,<2.0.0) ; extra == "experimental"
  Requires-Dist: ffmpeg-python ; extra == "benchmarking"
- Requires-Dist: imgaug ; extra == "benchmarking"
  Requires-Dist: matplotlib ; extra == "benchmarking"
- Requires-Dist: networkit (>=11,<12) ; extra == "experimental"
- Requires-Dist: numpy (>=1.26,<2.0)
- Requires-Dist: opencv-contrib-python-headless (>=4.10,<5.0)
+ Requires-Dist: networkit (>=11.1,<12.0.0) ; extra == "experimental"
+ Requires-Dist: numpy (>=1.26.4,<3.0.0)
+ Requires-Dist: opencv-contrib-python-headless (>=4.10.0,<5.0.0)
  Requires-Dist: pandas
- Requires-Dist: pdqhash
+ Requires-Dist: pdqhash (>=0.2.7,<0.3.0) ; extra == "pdq"
  Requires-Dist: python-json-logger ; extra == "matching"
  Requires-Dist: pywavelets (>=1.5.0,<2.0.0)
  Requires-Dist: rich (>=13.7.0,<14.0.0)
  Requires-Dist: scikit-learn ; extra == "benchmarking"
- Requires-Dist: scipy ; extra == "benchmarking"
+ Requires-Dist: scipy
  Requires-Dist: tabulate ; extra == "benchmarking"
- Requires-Dist: tqdm
- Requires-Dist: validators (>=0.22,<1.0)
+ Requires-Dist: tqdm (>=4.67.1,<5.0.0)
+ Requires-Dist: validators (>=0.22.0,<1.0.0)
  Description-Content-Type: text/markdown

  # perception ![ci](https://github.com/thorn-oss/perception/workflows/ci/badge.svg)
{perception-0.8.2 → perception-0.8.3}/perception/benchmarking/common.py
@@ -366,7 +366,7 @@ class BenchmarkHashes(Filterable):
              )
              X_noop = np.array(
                  noops.hash.apply(
-                     string_to_vector,
+                     string_to_vector,  # type: ignore[arg-type]
                      dtype=dtype,
                      hash_format="base64",
                      hash_length=int(hash_length),
@@ -502,8 +502,11 @@ class BenchmarkHashes(Filterable):
                  ax = axs[rowIdx if nrows > 1 else colIdx]

              # Plot the charts
+             inner_keys = ["guid"] + (
+                 ["transform_name"] if "transform_name" in subset.columns else []
+             )
              pos, neg = (
-                 subset.groupby(["guid", "transform_name"])[
+                 subset.groupby(inner_keys)[
                      [
                          "distance_to_closest_correct_image",
                          "distance_to_closest_incorrect_image",
@@ -562,8 +565,11 @@ class BenchmarkHashes(Filterable):
              grouping = ["category", "transform_name"]

          def group_func(subset):
+             inner_keys = ["guid"] + (
+                 ["transform_name"] if "transform_name" in subset.columns else []
+             )
              pos, neg = (
-                 subset.groupby(["guid", "transform_name"])[
+                 subset.groupby(inner_keys)[
                      [
                          "distance_to_closest_correct_image",
                          "distance_to_closest_incorrect_image",
{perception-0.8.2 → perception-0.8.3}/perception/benchmarking/image.py
@@ -4,7 +4,7 @@ import uuid
  import warnings

  import cv2
- import imgaug
+ import albumentations
  import pandas as pd
  from tqdm import tqdm

@@ -119,7 +119,7 @@ class BenchmarkImageDataset(BenchmarkDataset):

      def transform(
          self,
-         transforms: dict[str, imgaug.augmenters.meta.Augmenter],
+         transforms: dict[str, albumentations.BasicTransform],
          storage_dir: str,
          errors: str = "raise",
      ) -> BenchmarkImageTransforms:
@@ -129,7 +129,7 @@ class BenchmarkImageDataset(BenchmarkDataset):
              transforms: A dictionary of transformations. The only required
                  key is `noop` which determines how the original, untransformed
                  image is saved. For a true copy, simply make the `noop` key
-                 `imgaug.augmenters.Noop()`.
+                 `albumentations.NoOp`
              storage_dir: A directory to store all the images along with
                  their transformed counterparts.
              errors: How to handle errors reading files. If "raise", exceptions are
@@ -145,7 +145,7 @@ class BenchmarkImageDataset(BenchmarkDataset):
          os.makedirs(storage_dir, exist_ok=True)

          files = self._df.copy()
-         files["guid"] = [uuid.uuid4() for n in range(len(files))]
+         files["guid"] = [str(uuid.uuid4()) for n in range(len(files))]

          def apply_transform(files, transform_name):
              transform = transforms[transform_name]
@@ -166,6 +166,9 @@ class BenchmarkImageDataset(BenchmarkDataset):
                      continue
                  try:
                      transformed = transform(image=image)
+                     # If albumentations, output is a dict with 'image' key
+                     if isinstance(transformed, dict) and "image" in transformed:
+                         transformed = transformed["image"]
                  except Exception as e:
                      raise RuntimeError(
                          f"An exception occurred while processing {filepath} "
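The check added in the last hunk above can be illustrated in isolation. The following is a minimal sketch, not the package's code: `unwrap_transformed` and the two stand-in transforms are hypothetical names introduced here, and no albumentations objects are involved. It shows only the normalization step, where a dict result carrying an `image` key is reduced to the image and any other result passes through unchanged.

```python
def unwrap_transformed(result):
    """Return the image from a transform result.

    A dict with an "image" key is unwrapped; any other value is
    returned unchanged.
    """
    if isinstance(result, dict) and "image" in result:
        return result["image"]
    return result


# Hypothetical stand-ins for two calling conventions (not real library objects).
def dict_style_transform(image):
    """Mimics a transform that returns its output wrapped in a dict."""
    return {"image": image}


def array_style_transform(image):
    """Mimics a transform that returns the image directly."""
    return image


image = [[0, 1], [2, 3]]  # placeholder for an image array
from_dict = unwrap_transformed(dict_style_transform(image=image))
from_array = unwrap_transformed(array_style_transform(image=image))
```

Under these assumptions both calls yield the same image object, so the code that follows the transform call does not need to know which convention a given transform uses.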
{perception-0.8.2 → perception-0.8.3}/perception/benchmarking/video.py
@@ -94,7 +94,7 @@ class BenchmarkVideoDataset(BenchmarkDataset):
          os.makedirs(storage_dir, exist_ok=True)

          files = self._df.copy()
-         files["guid"] = [uuid.uuid4() for n in range(len(files))]
+         files["guid"] = [str(uuid.uuid4()) for n in range(len(files))]

          def apply_transform_to_file(input_filepath, guid, transform_name, category):
              if input_filepath is None:
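Both benchmarking datasets now store GUIDs as strings rather than `uuid.UUID` objects. The diff does not state the motivation. As a hedged illustration of one observable difference between the two representations, the sketch below (standard library only, variable names are illustrative) shows that a `uuid.UUID` object is rejected by `json.dumps`, while its string form round-trips and can be parsed back into an equal UUID.

```python
import json
import uuid

guid_object = uuid.uuid4()
guid_string = str(guid_object)

# json.dumps has no default encoder for uuid.UUID and raises TypeError.
try:
    json.dumps({"guid": guid_object})
    object_is_serializable = True
except TypeError:
    object_is_serializable = False

# The string form serializes and deserializes without a custom encoder.
round_tripped = json.loads(json.dumps({"guid": guid_string}))["guid"]

# The original UUID can still be recovered from the string when needed.
recovered = uuid.UUID(round_tripped)
```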
{perception-0.8.2 → perception-0.8.3}/perception/hashers/__init__.py
@@ -24,3 +24,10 @@ __all__ = [
      "PHashU8",
      "PHashF",
  ]
+
+ try:
+     from .image.pdq import PDQHash as PDQHash, PDQHashF as PDQHashF
+ except ImportError:
+     pass
+ else:
+     __all__.extend(["PDQHash", "PDQHashF"])
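The `try` / `except ImportError` / `else` block above exports the PDQ hashers only when their module imports successfully, which appears to go along with `pdqhash` moving into the optional `pdq` extra elsewhere in this diff. A minimal sketch of the same pattern follows. `build_exports` and the module name `hypothetical_missing_backend` are assumptions made for illustration and are not part of Perception.

```python
import importlib


def build_exports(base_names, optional_module, optional_names):
    """Return the export list, adding optional names only if the import works."""
    names = list(base_names)
    try:
        importlib.import_module(optional_module)
    except ImportError:
        # The optional dependency is absent: keep the base exports only.
        pass
    else:
        names.extend(optional_names)
    return names


# A module name assumed not to exist, so the optional names are skipped.
without_extra = build_exports(
    ["PHashU8", "PHashF"], "hypothetical_missing_backend", ["PDQHash", "PDQHashF"]
)

# "json" stands in for an optional dependency that is installed.
with_extra = build_exports(["PHashU8", "PHashF"], "json", ["PDQHash", "PDQHashF"])
```

Placing the `extend` call in the `else` branch keeps the export list consistent with what was actually imported, so a wildcard import never refers to a missing name.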
{perception-0.8.2 → perception-0.8.3}/perception/hashers/image/opencv.py
@@ -24,7 +24,7 @@ class MarrHildreth(OpenCVHasher):

      def __init__(self):
          super().__init__()
-         self.hasher = cv2.img_hash.MarrHildrethHash.create()
+         self.hasher = cv2.img_hash.MarrHildrethHash.create()  # type: ignore[attr-defined]

      def _compute(self, image):
          return np.unpackbits(self.hasher.compute(image)[0])
@@ -40,7 +40,7 @@ class ColorMoment(OpenCVHasher):

      def __init__(self):
          super().__init__()
-         self.hasher = cv2.img_hash.ColorMomentHash.create()
+         self.hasher = cv2.img_hash.ColorMomentHash.create()  # type: ignore[attr-defined]

      def _compute(self, image):
          return 10000 * self.hasher.compute(image)[0]
@@ -56,7 +56,7 @@ class BlockMean(OpenCVHasher):

      def __init__(self):
          super().__init__()
-         self.hasher = cv2.img_hash.BlockMeanHash.create(1)
+         self.hasher = cv2.img_hash.BlockMeanHash.create(1)  # type: ignore[attr-defined]

      def _compute(self, image):
          # https://stackoverflow.com/questions/54762896/why-cv2-norm-hamming-gives-different-value-than-actual-hamming-distance
{perception-0.8.2 → perception-0.8.3}/perception/hashers/tools.py
@@ -11,6 +11,7 @@ import os
  import queue
  import shlex
  import subprocess
+ import tempfile
  import threading
  import typing
  import warnings
@@ -27,7 +28,9 @@ import validators

  LOGGER = logging.getLogger(__name__)

- ImageInputType = typing.Union[str, np.ndarray, "PIL.Image.Image", io.BytesIO]
+ ImageInputType = typing.Union[
+     str, np.ndarray, "PIL.Image.Image", io.BytesIO, tempfile.SpooledTemporaryFile
+ ]

  SIZES = {"float32": 32, "uint8": 8, "bool": 1}

@@ -357,7 +360,10 @@ def read(filepath_or_buffer: ImageInputType, timeout=None) -> np.ndarray:
      """
      if isinstance(filepath_or_buffer, PIL.Image.Image):
          return np.array(filepath_or_buffer.convert("RGB"))
-     if isinstance(filepath_or_buffer, (io.BytesIO, client.HTTPResponse)):
+     if isinstance(
+         filepath_or_buffer,
+         (io.BytesIO, client.HTTPResponse, tempfile.SpooledTemporaryFile),
+     ):
          image = np.asarray(bytearray(filepath_or_buffer.read()), dtype=np.uint8)
          decoded_image = cv2.imdecode(image, cv2.IMREAD_UNCHANGED)
      elif isinstance(filepath_or_buffer, str):
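The `read` hunk above adds `tempfile.SpooledTemporaryFile` to the buffer types whose bytes are read and then decoded. Because `SpooledTemporaryFile` is not a subclass of `io.BytesIO`, an `isinstance` check against `io.BytesIO` alone would not match it. The sketch below covers only the buffer-reading half, using the standard library. `read_buffer_bytes` and the placeholder payload are hypothetical, and the `cv2.imdecode` step from the hunk is left out.

```python
import io
import tempfile

# Buffer types accepted by this sketch (a subset of those in the hunk).
BUFFER_TYPES = (io.BytesIO, tempfile.SpooledTemporaryFile)


def read_buffer_bytes(buffer):
    """Return the bytes from the buffer's current position to its end."""
    if not isinstance(buffer, BUFFER_TYPES):
        raise TypeError(f"Unsupported buffer type: {type(buffer).__name__}")
    return buffer.read()


payload = b"placeholder bytes, not a real encoded image"

with tempfile.SpooledTemporaryFile(max_size=1024) as spooled:
    spooled.write(payload)
    spooled.seek(0)  # rewind: read() starts at the current position
    from_spooled = read_buffer_bytes(spooled)

from_bytesio = read_buffer_bytes(io.BytesIO(payload))
```

A caller that has just written to a spooled file needs to rewind it first, since `read()` returns only what follows the current position.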
@@ -561,9 +567,7 @@ def read_video_to_generator_ffmpeg(
          ), f"Invalid framerate: {frames_per_second}"
          seconds_per_frame = 1 / frames_per_second
          filters.append(
-             f"fps={frames_per_second}:"
-             f"round={frame_rounding}:"
-             f"start_time={offset}"
+             f"fps={frames_per_second}:round={frame_rounding}:start_time={offset}"
          )
      # Add resizing filters.
      if use_cuda and codec_name in CUDA_CODECS:
@@ -601,7 +605,12 @@ def read_video_to_generator_ffmpeg(
                  if not batch:
                      break
                  for image in np.frombuffer(batch, dtype="uint8").reshape(
-                     (-1, height, width, channels)
+                     (
+                         -1,
+                         height,
+                         width,
+                         channels,
+                     )
                  ):
                      if frames_per_second != "keyframes":
                          yield (image, frame_index, timestamp)
@@ -960,137 +969,206 @@ def compute_synchronized_video_hashes(


  def unletterbox(
-     image, only_remove_black: bool = False, min_fraction_meaningful_pixels: float = 0.1
+     image: np.ndarray,
+     only_remove_black: bool = False,
+     min_fraction_meaningful_pixels: float = 0.1,
+     color_threshold: float = 2,
+     min_side_length: int = 50,
+     min_reduction: float = 0.02,
  ) -> tuple[tuple[int, int], tuple[int, int]] | None:
-     """Return bounds of non-trivial region of image or None.
-
-     Unletterboxing is cropping an image such that trivial edge regions
-     are removed. Trivial in this context means that the majority of
-     the values in that row or column are zero or very close to
-     zero.
-
-     In order to do unletterboxing, this function returns bounds in the
-     form (x1, x2), (y1, y2) where:
-
-     - x1 is the index of the first column where over X% of the pixels
-       have means (average of R, G, B) > 2.
-     - x2 is the index of the last column where over X% of the pixels
-       have means > 2.
-     - y1 is the index of the first row where over X% of the pixels
-       have means > 2.
-     - y2 is the index of the last row where over X% of the pixels
-       have means > 2.
-     - X is min_fraction_meaningful_pixels 0.1 == 10%
-
-     If there are zero columns or zero rows where over X% of the
-     pixels have means > 2, this function returns `None`.
-
-     Note that in the case(s) of a single column and/or row of
-     non-trivial pixels that it is possible for x1 = x2 and/or y1 = y2.
-
-     Consider these examples to understand edge cases. Given two
-     images, `L` (entire left and bottom edges are 1, all other pixels
-     0) and `U` (left, bottom and right edges 1, all other pixels 0),
-     `unletterbox(L)` would return the bounds of the single bottom-left
-     pixel and `unletterbox(U)` would return the bounds of the entire
-     bottom row.
-
-     Consider `U1` which is the same as `U` but with the bottom two
-     rows all 1s. `unletterbox(U1)` returns the bounds of the bottom
-     two rows.
+     """Return bounds of the non-trivial (content) region of an image, or None.
+
+     Letterboxing refers to uniform-color borders added around an image
+     (e.g., black bars on a video frame). This function detects such borders
+     by identifying the background color from the image corners and finding
+     the bounding box of pixels that differ from that background.
+
+     The function returns bounds as ``(x1, x2), (y1, y2)`` suitable for
+     slicing: ``image[y1:y2, x1:x2]``. The bounds are exclusive on the
+     right/bottom (i.e., x2 and y2 point one past the last content pixel).
+
+     **Algorithm overview:**
+
+     1. Sample the four corner pixels and find the most common value as
+        the candidate background color. If all four corners differ, return
+        ``None`` (no consistent letterbox detected).
+     2. Build a binary content mask where each pixel whose grayscale
+        intensity differs from the background by more than
+        ``color_threshold`` is marked as content.
+     3. Project the mask onto rows and columns and find the first/last
+        row and column where the fraction of content pixels exceeds
+        ``min_fraction_meaningful_pixels``.
+     4. Validate that the resulting crop is meaningfully smaller than the
+        original (controlled by ``min_reduction``) and that both sides
+        exceed ``min_side_length``.
+
+     Returns ``None`` when:
+
+     - No two corners share the same color (no clear background).
+     - Every pixel differs from the detected background (no border).
+     - No row or column meets the content-pixel threshold.
+     - The crop would not remove at least ``min_reduction`` fraction
+       from any dimension.
+     - Either cropped dimension would be smaller than ``min_side_length``.

      Args:
-         image: The image from which to remove letterboxing.
-         only_remove_black: Set False to remove borders fo any color.
-         min_fraction_meaningful_pixels: 0 to 1: if cropped version is
-         smaller than this fraction of the image do not unletterbox.
-         0.1 == 10% of the image.
+         image: Input image as an ``np.ndarray``. May be grayscale (H×W)
+             or RGB (H×W×3); RGB images are converted to grayscale
+             internally for background detection.
+         only_remove_black: If ``True``, treat black (intensity 0) as the
+             background regardless of corner colors. If ``False`` (default),
+             infer the background color from the most common corner value.
+         min_fraction_meaningful_pixels: The minimum fraction (0–1) of
+             pixels in a row or column that must differ from the background
+             for that row/column to be considered part of the content region.
+             Defaults to 0.1 (10%).
+         color_threshold: The minimum absolute difference in grayscale
+             intensity between a pixel and the background color for that
+             pixel to be classified as content. Defaults to 2.
+         min_side_length: The minimum width or height (in pixels) of the
+             cropped region. If the crop would be smaller, ``None`` is
+             returned. Defaults to 50.
+         min_reduction: The minimum fraction (0–1) of the original width
+             or height that must be removed for the crop to be worthwhile.
+             If the crop removes less than this from both dimensions,
+             ``None`` is returned. Defaults to 0.02 (2%).

      Returns:
-         A pair of coordinates bounds of the form (x1, x2)
-         and (y1, y2) representing the left, right, top, and
-         bottom bounds.
-
+         A tuple ``((x1, x2), (y1, y2))`` giving the left, right, top,
+         and bottom bounds of the content region (right/bottom exclusive),
+         or ``None`` if no meaningful letterbox was detected.
      """
-     assert 0 <= min_fraction_meaningful_pixels <= 1, "min_size must be between 0 and 1"
-     if not only_remove_black:
-         height, width, colors = image.shape
-
-         bottom = height - 1
-         right = width - 1
-         top = 0
-         left = 0
-
-         # Generate a count of the corner pixels.
-         counts = Counter(
-             [
-                 tuple(image[top, left]),
-                 tuple(image[top, right]),
-                 tuple(image[bottom, left]),
-                 tuple(image[bottom, right]),
-             ]
+     if not 0 <= min_fraction_meaningful_pixels <= 1:
+         raise ValueError("min_fraction_meaningful_pixels must be between 0 and 1")
+     if not 0 <= min_reduction <= 1:
+         raise ValueError("min_reduction must be between 0 and 1")
+     image = image.astype(np.uint8)
+
+     shape = image.shape
+     h, w = shape[0:2]
+     if len(shape) == 3:
+         image = cv2.cvtColor(image, cv2.COLOR_RGB2GRAY)
+
+     # Determine background color and build binary content mask.
+     if only_remove_black:
+         bg_gray = 0
+     else:
+         # Sample the four corner pixels. If all four are unique there is no
+         # consistent background color, so we bail out early (O(1) rejection).
+         corners = (
+             image[0, 0],
+             image[0, w - 1],
+             image[h - 1, 0],
+             image[h - 1, w - 1],
          )
-         if len(counts) == 4:
-             return (0, image.shape[1]), (0, image.shape[0])
-
-         # Grab reference color.
-         # We grab the most common shared color.
-         bg_color, _ = counts.most_common(1)[0]
-
-         # Create an image of just that color. dtype to match image.
-         mask = np.ones((height, width, colors), dtype=np.int16)
-         mask[:, :] = np.array(bg_color)
-
-         # Diff the image so that color is black.
-         image = np.abs(np.subtract(image, mask))
-
-     # adj should be thought of as a boolean at each pixel indicating
-     # whether or not that pixel is non-trivial (True) or not (False).
-     adj = image.mean(axis=2) > 2

-     if adj.all():
-         return (0, image.shape[1] + 1), (0, image.shape[0] + 1)
+         if len(set(corners)) == 4:
+             LOGGER.debug("No common corner color detected, skipping content detection.")
+             return (
+                 (0, w),
+                 (0, h),
+             )  # Return full image bounds instead of None to maintain backwards compatibility
+         # Use the most common corner value as the background intensity.
+         counts = Counter(corners)
+         bg_gray = counts.most_common(1)[0][0]
+
+     # Mark pixels whose grayscale intensity differs from the background
+     # by more than color_threshold as content (True).
+     content_mask = np.abs(image.astype(np.int16) - bg_gray) > color_threshold
+
+     # If every pixel is classified as content, there is no border to remove.
+     if content_mask.all():
+         LOGGER.debug("All pixels differ from background; no letterbox detected.")
+         return (
+             (0, w),
+             (0, h),
+         )  # Return full image bounds instead of None to maintain backwards compatibility
+
+     # Find the content bounding box by projecting the mask onto rows and
+     # columns. cv2.reduce is used instead of np.sum for performance.
+     mask_u8 = content_mask.astype(np.uint8)
+     row_content = cv2.reduce(mask_u8, 1, cv2.REDUCE_SUM, dtype=cv2.CV_32S).ravel()
+     col_content = cv2.reduce(mask_u8, 0, cv2.REDUCE_SUM, dtype=cv2.CV_32S).ravel()
+
+     # Thresholds for minimum content per row/column
+     row_threshold = min_fraction_meaningful_pixels * w
+     col_threshold = min_fraction_meaningful_pixels * h
+
+     # Find first/last rows and columns with sufficient content
+     content_rows = np.where(row_content > row_threshold)[0]
+     content_cols = np.where(col_content > col_threshold)[0]
+
+     if len(content_rows) == 0 or len(content_cols) == 0:
+         LOGGER.debug("No rows or columns with sufficient content detected.")
+         return None

-     # Find rows and cols with at least min_fraction_meaningful_pixels.
-     y = np.where(adj.sum(axis=1) > min_fraction_meaningful_pixels * image.shape[1])[0]
-     x = np.where(adj.sum(axis=0) > min_fraction_meaningful_pixels * image.shape[0])[0]
+     top = int(content_rows[0])
+     bottom = int(content_rows[-1]) + 1
+     left = int(content_cols[0])
+     right = int(content_cols[-1]) + 1
+     height = bottom - top
+     width = right - left

-     # Either no rows or no columns had enough meaningful information to keep.
-     if len(y) == 0 or len(x) == 0:
+     # Reject if the crop does not remove at least min_reduction from
+     # at least one dimension (i.e., the border is negligibly thin).
+     if width >= w * (1 - min_reduction) and height >= h * (1 - min_reduction):
+         LOGGER.debug(
+             "Crop would not reduce either dimension by %.0f%%; skipping.",
+             min_reduction * 100,
+         )
+         return (
+             (0, w),
+             (0, h),
+         )  # Return full image bounds instead of None to maintain backwards compatibility
+     # Reject if the remaining content region is too small to be useful.
+     if width < min_side_length or height < min_side_length:
+         LOGGER.debug(
+             "Cropped region (%dx%d) smaller than min_side_length=%d; skipping.",
+             width,
+             height,
+             min_side_length,
+         )
          return None

-     if len(y) == 1:
-         y1 = y2 = y[0]
-     else:
-         y1, y2 = y[[0, -1]]
-     if len(x) == 1:
-         x1 = x2 = x[0]
-     else:
-         x1, x2 = x[[0, -1]]
-     bounds = (x1, x2 + 1), (y1, y2 + 1)
-
-     return bounds
+     return ((left, right), (top, bottom))


  def unletterbox_crop(
-     image: np.ndarray, min_fraction_meaningful_pixels: float = 0.1
+     image: np.ndarray,
+     min_fraction_meaningful_pixels: float = 0.1,
+     color_threshold: float = 2,
+     min_side_length: int = 50,
+     min_reduction: float = 0.02,
  ) -> np.ndarray | None:
      """Detect and crop the letterboxed regions from an image.

      Args:
          image: The image from which to remove letterboxing.
          min_fraction_meaningful_pixels: 0 to 1: if cropped version is
-         smaller than this fraction of the image do not unletterbox.
-         0.1 == 10% of the image.
+             smaller than this fraction of the image do not unletterbox.
+             0.1 == 10% of the image.
+         color_threshold: The minimum absolute difference in grayscale
+             intensity between a pixel and the background color for that
+             pixel to be classified as content. Defaults to 2.
+         min_side_length: The minimum width or height (in pixels) of the
+             cropped region. If the crop would be smaller, ``None`` is
+             returned. Defaults to 50.
+         min_reduction: The minimum fraction (0–1) of the original width
+             or height that must be removed for the crop to be worthwhile.
+             If the crop removes less than this from both dimensions,
+             the original image is returned. Defaults to 0.02 (2%).
      Returns:
          The cropped image or None if the image is mostly blank space.
      """
-     assert isinstance(
-         image, np.ndarray
-     ), "Please send np.ndarray to unletterbox_image()."
+     if not isinstance(image, np.ndarray):
+         raise TypeError(f"Expected np.ndarray, got {type(image).__name__}")

      bounds = unletterbox(
-         image, min_fraction_meaningful_pixels=min_fraction_meaningful_pixels
+         image,
+         min_fraction_meaningful_pixels=min_fraction_meaningful_pixels,
+         color_threshold=color_threshold,
+         min_side_length=min_side_length,
+         min_reduction=min_reduction,
      )
      if bounds is None:
          return None
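The corner-based detection described in the new `unletterbox` docstring can be sketched as follows. This is a simplified illustration under stated assumptions, not the released implementation: `content_bounds` is a hypothetical name, the input is assumed to be an already grayscale 2-D `uint8` array, `np.sum` replaces the `cv2.reduce` projection, and the `min_reduction` and `min_side_length` checks are omitted. Where the docstring and the code in the hunk disagree (the docstring says `None` is returned when all four corners differ, while the code returns the full image bounds with a comment about backwards compatibility), the sketch follows the code.

```python
from collections import Counter

import numpy as np


def content_bounds(gray, min_fraction=0.1, color_threshold=2):
    """Return ((x1, x2), (y1, y2)) of the content region, or None.

    Right and bottom bounds are exclusive, so the result can be used as
    gray[y1:y2, x1:x2]. Size and reduction checks are not performed.
    """
    h, w = gray.shape
    corners = [
        int(gray[0, 0]),
        int(gray[0, w - 1]),
        int(gray[h - 1, 0]),
        int(gray[h - 1, w - 1]),
    ]
    if len(set(corners)) == 4:
        # No shared corner value: keep the full image, as the hunk's code does.
        return (0, w), (0, h)
    background = Counter(corners).most_common(1)[0][0]

    # Widen to int16 so the subtraction cannot wrap around in uint8.
    mask = np.abs(gray.astype(np.int16) - background) > color_threshold

    # Keep rows and columns whose share of content pixels exceeds min_fraction.
    rows = np.where(mask.sum(axis=1) > min_fraction * w)[0]
    cols = np.where(mask.sum(axis=0) > min_fraction * h)[0]
    if len(rows) == 0 or len(cols) == 0:
        return None
    return (int(cols[0]), int(cols[-1]) + 1), (int(rows[0]), int(rows[-1]) + 1)


# A 100x200 black frame with a gray rectangle in rows 20..79, columns 30..169.
frame = np.zeros((100, 200), dtype=np.uint8)
frame[20:80, 30:170] = 128
bounds = content_bounds(frame)  # ((30, 170), (20, 80))
```

Casting to a signed type before subtracting matters because `uint8` arithmetic wraps around, which would make pixels slightly darker than the background look very different from it.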
{perception-0.8.2 → perception-0.8.3}/perception/hashers/video/tmk.py
@@ -56,7 +56,7 @@ class TMKL2(VideoHasher):
                  for i in range(1, self.m)
              ]
          )
-         a = a.reshape(1, -1).repeat(repeats=len(self.T), axis=0)
+         a = a.reshape(1, -1).repeat(repeats=len(self.T), axis=0)  # type: ignore
          a = np.sqrt(a)
          self.a = a[..., np.newaxis]

@@ -77,7 +77,12 @@ class TMKL2(VideoHasher):
      def hash_from_final_state(self, state):
          timestamps = np.array(state["timestamps"])
          features = np.array(state["features"]).reshape(
-             (1, 1, timestamps.shape[0], self.frame_hasher.hash_length)
+             (
+                 1,
+                 1,
+                 timestamps.shape[0],
+                 self.frame_hasher.hash_length,
+             )
          )
          x = self.ms_normed * timestamps
          yw1 = np.sin(x) * self.a
{perception-0.8.2 → perception-0.8.3}/perception/testing/__init__.py
@@ -101,8 +101,8 @@ def hash_dicts_to_df(hash_dicts, returns_multiple):
                  ),
                  "hash": tools.flatten([h["hash"] for h in hash_dicts]),
              }
-         ).assign(error=None)
-     return pd.DataFrame.from_records(hash_dicts).assign(error=None)
+         ).assign(error=np.nan)
+     return pd.DataFrame.from_records(hash_dicts).assign(error=np.nan)


  def test_hasher_parallelization(hasher, test_filepaths):
@@ -156,7 +156,9 @@ def test_image_hasher_integrity(
      image2 = test_images[1]
      hash1_1 = str(hasher.compute(image1))  # str() games for mypy, not proud
      hash1_2 = str(hasher.compute(Image.open(image1)))
-     hash1_3 = str(hasher.compute(cv2.cvtColor(cv2.imread(image1), cv2.COLOR_BGR2RGB)))
+     image_cv = cv2.imread(image1)
+     assert image_cv is not None, f"Failed to load image: {image1}"
+     hash1_3 = str(hasher.compute(cv2.cvtColor(image_cv, cv2.COLOR_BGR2RGB)))

      hash2_1 = str(hasher.compute(image2))

@@ -1,51 +1,40 @@
1
- [tool.poetry]
1
+ [project]
2
2
  name = "Perception"
3
- version = "0.8.2"
3
+ dynamic = []
4
4
  description = "Perception provides flexible, well-documented, and comprehensively tested tooling for perceptual hashing research, development, and production use."
5
- authors = ["Thorn <info@wearethorn.org>"]
6
- license = "Apache License 2.0"
5
+ authors = [{ name = "Thorn", email = "info@wearethorn.org" }]
6
+ license = "Apache-2.0"
7
7
  readme = "README.md"
8
+ requires-python = ">=3.10,<4.0"
9
+ dependencies = [
10
+ "Cython>=3.0.0,<4.0.0",
11
+ "numpy>=1.26.4,<3.0.0",
12
+ "opencv-contrib-python-headless>=4.10.0,<5.0.0",
13
+ "pandas",
14
+ "Pillow",
15
+ "pywavelets>=1.5.0,<2.0.0",
16
+ "validators>=0.22.0,<1.0.0",
17
+ "rich>=13.7.0,<14.0.0",
18
+ "scipy",
19
+ "tqdm>=4.67.1,<5.0.0",
20
+ ]
21
+ version = "0.8.3"
8
22
 
9
- [tool.poetry.dependencies]
10
- python = "^3.10"
11
- Cython = "^3"
12
- numpy = "^1.26"
13
- opencv-contrib-python-headless = "^4.10"
14
- pandas = "*"
15
- pdqhash = "*"
16
- Pillow = "*"
17
- pywavelets = "^1.5.0"
18
- tqdm = "*"
19
- validators = ">=0.22, <1.0"
20
- scipy = "*"
21
-
22
- # Benchmarking Extras
23
- matplotlib = { version = "*", optional = true }
24
- imgaug = { version = "*", optional = true }
25
- tabulate = { version = "*", optional = true }
26
- scikit-learn = { version = "*", optional = true }
27
- ffmpeg-python = { version = "*", optional = true }
28
-
29
- # Matching Extras
30
- aiohttp = { version = "*", optional = true }
31
- python-json-logger = { version = "*", optional = true }
32
- rich = "^13.7.0"
33
-
34
- # Experimental Extras
35
- networkit = { version = "^11", optional = true }
36
- faiss-cpu = { version = "^1.8.0.post1", optional = true }
37
23
 
38
- [tool.poetry.extras]
24
+ [project.optional-dependencies]
39
25
  benchmarking = [
40
26
  "matplotlib",
41
- "scipy",
42
- "imgaug",
27
+ "albumentations>=2.0.8,<3.0.0",
43
28
  "tabulate",
44
29
  "scikit-learn",
45
30
  "ffmpeg-python",
46
31
  ]
47
32
  matching = ["aiohttp", "python-json-logger"]
48
- experimental = ["networkit", "faiss-cpu"]
33
+ experimental = ["networkit>=11.1,<12.0.0", "faiss-cpu>=1.8.0,<2.0.0"]
34
+ pdq = ["pdqhash>=0.2.7,<0.3.0"]
35
+
36
+
37
+ [tool.poetry]
49
38
 
50
39
 
51
40
  [tool.poetry.group.dev.dependencies]
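Note on the hunk above: the removed `[tool.poetry.dependencies]` table used Poetry caret constraints (`^3`, `^1.26`, `^4.10`), while the new `[project]` table writes explicit ranges. The sketch below illustrates how a caret constraint on a purely numeric version expands into a range. `caret_to_range` is a hypothetical helper, not Poetry's implementation, and it does not handle suffixes such as `.post1`. Several of the 0.8.3 ranges are not plain expansions of the old constraints; for example, the `numpy` upper bound moves from `<2.0` to `<3.0.0`.

```python
def caret_to_range(version: str) -> str:
    """Expand a caret constraint such as "1.26" into ">=1.26,<2.0".

    Assumes `version` holds only dot-separated integers. The upper bound
    bumps the left-most non-zero component and zeroes everything after it;
    if every component is zero, the last component is bumped instead.
    """
    parts = [int(p) for p in version.split(".")]
    # Index of the first non-zero component, or the last index if all are zero.
    pivot = next((i for i, p in enumerate(parts) if p != 0), len(parts) - 1)
    upper = parts[:pivot] + [parts[pivot] + 1] + [0] * (len(parts) - pivot - 1)
    return f">={version},<{'.'.join(str(p) for p in upper)}"
```

For the old constraints, `caret_to_range("3")` gives `>=3,<4` and `caret_to_range("1.26")` gives `>=1.26,<2.0`, which matches the ranges removed from `setup.py` in this release.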
@@ -61,6 +50,7 @@ ruff = "*"
61
50
  types-pillow = "*"
62
51
  types-tqdm = "*"
63
52
  twine = "*"
53
+ albumentations = "^2.0.8"
64
54
 
65
55
 
66
56
  [tool.poetry.build]
@@ -74,6 +64,7 @@ ignore_missing_imports = true
74
64
 
75
65
  [tool.poetry-dynamic-versioning]
76
66
  enable = false
67
+ vcs = "git"
77
68
 
78
69
  [build-system]
79
70
  requires = [
@@ -14,30 +14,30 @@ package_data = \
14
14
  {'': ['*'], 'perception.testing': ['images/*', 'logos/*', 'videos/*']}
15
15
 
16
16
  install_requires = \
17
- ['Cython>=3,<4',
17
+ ['Cython>=3.0.0,<4.0.0',
18
18
  'Pillow',
19
- 'numpy>=1.26,<2.0',
20
- 'opencv-contrib-python-headless>=4.10,<5.0',
19
+ 'numpy>=1.26.4,<3.0.0',
20
+ 'opencv-contrib-python-headless>=4.10.0,<5.0.0',
21
21
  'pandas',
22
- 'pdqhash',
23
22
  'pywavelets>=1.5.0,<2.0.0',
24
23
  'rich>=13.7.0,<14.0.0',
25
- 'tqdm',
26
- 'validators>=0.22,<1.0']
24
+ 'scipy',
25
+ 'tqdm>=4.67.1,<5.0.0',
26
+ 'validators>=0.22.0,<1.0.0']
27
27
 
28
28
  extras_require = \
29
- {':extra == "benchmarking"': ['scipy'],
30
- 'benchmarking': ['matplotlib',
31
- 'imgaug',
29
+ {'benchmarking': ['matplotlib',
30
+ 'albumentations>=2.0.8,<3.0.0',
32
31
  'tabulate',
33
32
  'scikit-learn',
34
33
  'ffmpeg-python'],
35
- 'experimental': ['networkit>=11,<12', 'faiss-cpu>=1.8.0.post1,<2.0.0'],
36
- 'matching': ['aiohttp', 'python-json-logger']}
34
+ 'experimental': ['networkit>=11.1,<12.0.0', 'faiss-cpu>=1.8.0,<2.0.0'],
35
+ 'matching': ['aiohttp', 'python-json-logger'],
36
+ 'pdq': ['pdqhash>=0.2.7,<0.3.0']}
37
37
 
38
38
  setup_kwargs = {
39
39
  'name': 'Perception',
40
- 'version': '0.8.2',
40
+ 'version': '0.8.3',
41
41
  'description': 'Perception provides flexible, well-documented, and comprehensively tested tooling for perceptual hashing research, development, and production use.',
42
42
  'long_description': "# perception ![ci](https://github.com/thorn-oss/perception/workflows/ci/badge.svg)\n\n`perception` provides flexible, well-documented, and comprehensively tested tooling for perceptual hashing research, development, and production use. See [the documentation](https://perception.thorn.engineering/en/latest/) for details.\n\n## Background\n\n`perception` was initially developed at [Thorn](https://www.thorn.org) as part of our work to eliminate child sexual abuse material from the internet. For more information on the issue, check out [our CEO's TED talk](https://www.thorn.org/blog/time-is-now-eliminate-csam/).\n\n## Getting Started\n\n### Installation\n\n`pip install perception`\n\n### Hashing\n\nHashing with different functions is simple with `perception`.\n\n```python\nfrom perception import hashers\n\nfile1, file2 = 'test1.jpg', 'test2.jpg'\nhasher = hashers.PHash()\nhash1, hash2 = hasher.compute(file1), hasher.compute(file2)\ndistance = hasher.compute_distance(hash1, hash2)\n```\n\n### Examples\n\nSee below for end-to-end examples for common use cases for perceptual hashes.\n\n- [Detecting child sexual abuse material](https://perception.thorn.engineering/en/latest/examples/detecting_csam.html)\n- [Deduplicating media](https://perception.thorn.engineering/en/latest/examples/deduplication.html)\n- [Benchmarking perceptual hashes](https://perception.thorn.engineering/en/latest/examples/benchmarking.html)\n\n## Supported Hashing Algorithms\n\n`perception` currently ships with:\n\n- pHash (DCT hash) (`perception.hashers.PHash`)\n- Facebook's PDQ Hash (`perception.hashers.PDQ`)\n- dHash (difference hash) (`perception.hashers.DHash`)\n- aHash (average hash) (`perception.hashers.AverageHash`)\n- Marr-Hildreth (`perception.hashers.MarrHildreth`)\n- Color Moment (`perception.hashers.ColorMoment`)\n- Block Mean (`perception.hashers.BlockMean`)\n- wHash (wavelet hash) (`perception.hashers.WaveletHash`)\n\n## Contributing\n\nTo work on the project, start by doing the following.\n\n```bash\n# Install local dependencies for\n# code completion, etc.\nmake init\n\n- To do a (close to) comprehensive check before committing code, you can use `make precommit`.\n\nTo implement new features, please first file an issue proposing your change for discussion.\n\nTo report problems, please file an issue with sample code, expected results, actual results, and a complete traceback.\n\n## Alternatives\n\nThere are other packages worth checking out to see if they meet your needs for perceptual hashing. Here are some\nexamples.\n\n- [dedupe](https://github.com/dedupeio/dedupe)\n- [imagededup](https://idealo.github.io/imagededup/)\n- [ImageHash](https://github.com/JohannesBuchner/imagehash)\n- [PhotoHash](https://github.com/bunchesofdonald/photohash)\n```\n",
43
43
  'author': 'Thorn',
File without changes
File without changes
File without changes
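Note on the `pyproject.toml` and `setup.py` hunks above: in 0.8.3, `pdqhash` moves out of the core requirements into a new `pdq` extra, so it is no longer installed by default. Calling code that relies on PDQ may want to check for the module up front. The sketch below uses only the standard library; `has_optional` and `require_optional` are hypothetical helpers, and how `perception` itself reacts to a missing `pdqhash` is not shown in this part of the diff.

```python
import importlib.util


def has_optional(module_name: str) -> bool:
    """Return True if a top-level module can be found without importing it."""
    return importlib.util.find_spec(module_name) is not None


def require_optional(module_name: str, extra: str) -> None:
    """Raise ImportError with an install hint if an optional module is missing.

    The hint assumes the extra is installed via pip using the extras syntax.
    """
    if not has_optional(module_name):
        raise ImportError(
            f"{module_name} is not installed; "
            f"try: pip install 'perception[{extra}]'"
        )
```

A caller might run `require_optional("pdqhash", "pdq")` before constructing a PDQ hasher, so that a missing extra fails early with a hint about how to install it.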