PyPI - Perception - Versions diffs - 0.8.2__tar.gz → 0.8.4__tar.gz - Mend

Perception 0.8.2.tar.gz → 0.8.4.tar.gz

Files changed (59)

{perception-0.8.2 → perception-0.8.4}/PKG-INFO RENAMED

@@ -1,13 +1,12 @@
 Metadata-Version: 2.4
 Name: Perception
-Version: 0.8.2
+Version: 0.8.4
 Summary: Perception provides flexible, well-documented, and comprehensively tested tooling for perceptual hashing research, development, and production use.
-License: Apache-2.0
+License-Expression: Apache-2.0
 License-File: LICENSE
 Author: Thorn
 Author-email: info@wearethorn.org
 Requires-Python: >=3.10,<4.0
-Classifier: License :: OSI Approved :: Apache Software License
 Classifier: Programming Language :: Python :: 3
 Classifier: Programming Language :: Python :: 3.10
 Classifier: Programming Language :: Python :: 3.11
@@ -15,28 +14,29 @@ Classifier: Programming Language :: Python :: 3.12
 Classifier: Programming Language :: Python :: 3.13
 Classifier: Programming Language :: Python :: 3.14
 Provides-Extra: benchmarking
-Provides-Extra: experimental
 Provides-Extra: matching
-Requires-Dist: Cython (>=3,<4)
+Provides-Extra: pdq
+Requires-Dist: Cython (>=3.0.0,<4.0.0)
 Requires-Dist: Pillow
 Requires-Dist: aiohttp ; extra == "matching"
-Requires-Dist: faiss-cpu (>=1.8.0.post1,<2.0.0) ; extra == "experimental"
+Requires-Dist: albumentations (>=2.0.8,<3.0.0) ; extra == "benchmarking"
+Requires-Dist: faiss-cpu (>=1.8.0,<2.0.0)
 Requires-Dist: ffmpeg-python ; extra == "benchmarking"
-Requires-Dist: imgaug ; extra == "benchmarking"
 Requires-Dist: matplotlib ; extra == "benchmarking"
-Requires-Dist: networkit (>=11,<12) ; extra == "experimental"
-Requires-Dist: numpy (>=1.26,<2.0)
-Requires-Dist: opencv-contrib-python-headless (>=4.10,<5.0)
+Requires-Dist: networkit (>=11.1,<12.0.0) ; sys_platform != "darwin"
+Requires-Dist: networkx (>=3.0,<4.0) ; sys_platform == "darwin"
+Requires-Dist: numpy (>=1.26.4,<3.0.0)
+Requires-Dist: opencv-contrib-python-headless (>=4.10.0,<5.0.0)
 Requires-Dist: pandas
-Requires-Dist: pdqhash
+Requires-Dist: pdqhash (>=0.2.7,<0.3.0) ; extra == "pdq"
 Requires-Dist: python-json-logger ; extra == "matching"
 Requires-Dist: pywavelets (>=1.5.0,<2.0.0)
 Requires-Dist: rich (>=13.7.0,<14.0.0)
 Requires-Dist: scikit-learn ; extra == "benchmarking"
-Requires-Dist: scipy ; extra == "benchmarking"
+Requires-Dist: scipy
 Requires-Dist: tabulate ; extra == "benchmarking"
-Requires-Dist: tqdm
-Requires-Dist: validators (>=0.22,<1.0)
+Requires-Dist: tqdm (>=4.67.1,<5.0.0)
+Requires-Dist: validators (>=0.22.0,<1.0.0)
 Description-Content-Type: text/markdown
 # perception ![ci](https://github.com/thorn-oss/perception/workflows/ci/badge.svg)

{perception-0.8.2 → perception-0.8.4}/build.py RENAMED

@@ -1,7 +1,6 @@
 from Cython.Build import cythonize
 import numpy as np
 compiler_directives = {"language_level": 3, "embedsignature": True}

{perception-0.8.2 → perception-0.8.4}/perception/approximate_deduplication/__init__.py RENAMED

@@ -4,11 +4,12 @@ import os.path as op
 import typing
 import faiss
-import networkit as nk
 import numpy as np
 import tqdm
 import typing_extensions
+from ._graph_backend import get_graph_backend
 LOGGER = logging.getLogger(__name__)
 DEFAULT_PCT_PROBE = 0
@@ -227,16 +228,13 @@ def pairs_to_clusters(
     node_to_id_map = {v: k for k, v in id_to_node_map.items()}
     LOGGER.debug("Building graph.")
-    graph = nk.Graph(len(list_ids))
     node_pairs = {(id_to_node_map[pair[0]], id_to_node_map[pair[1]]) for pair in pairs}
-    for node_pair in node_pairs:
-        graph.addEdge(node_pair[0], node_pair[1])
+    backend = get_graph_backend()
+    graph = backend.build_graph(len(list_ids), node_pairs)
     assignments: list[ClusterAssignment] = []
     cluster_index = 0
-    cc_query = nk.components.ConnectedComponents(graph)
-    cc_query.run()
-    components = cc_query.getComponents()
+    components = backend.connected_components(graph)
     for component in components:
         LOGGER.debug("Got component with size: %s", len(component))
@@ -246,19 +244,9 @@ def pairs_to_clusters(
             )
             cluster_index += 1
             continue
-        # Map between node values for a connected component
-        component_node_map = dict(enumerate(component))
-        cc_sub_graph = nk.graphtools.subgraphFromNodes(graph, component, compact=True)
-        algo = nk.community.PLP(cc_sub_graph)
-        algo.run()
-        communities = algo.getPartition()
-        community_map = communities.subsetSizeMap()
-        for community, size in community_map.items():
-            LOGGER.debug("Got community with size: %s", size)
-            community_members = list(
-                communities.getMembers(community)
-            )  # Need to do this to do batching.
-            community_members = [component_node_map[i] for i in community_members]
+        communities = backend.communities(graph, component)
+        for community_members in communities:
+            LOGGER.debug("Got community with size: %s", len(community_members))
             if strictness == "community":
                 assignments.extend(
                     [
@@ -269,33 +257,20 @@ def pairs_to_clusters(
                 cluster_index += 1
                 continue
-            for start in range(0, len(community_members), max_clique_batch_size):
-                community_nodes = community_members[
-                    start : start + max_clique_batch_size
-                ]
-                LOGGER.debug("Creating subgraph with %s nodes.", len(community_nodes))
-                # Map between node values for a community
-                community_node_map = dict(enumerate(community_nodes))
-                subgraph = nk.graphtools.subgraphFromNodes(
-                    graph, community_nodes, compact=True
+            for clique_members in backend.maximal_cliques(
+                graph,
+                community_members,
+                max_clique_batch_size=max_clique_batch_size,
+            ):
+                assignments.extend(
+                    [
+                        {
+                            "id": node_to_id_map[n],
+                            "cluster": cluster_index,
+                        }
+                        for n in clique_members
+                    ]
                 )
-                while subgraph.numberOfNodes() > 0:
-                    LOGGER.debug("Subgraph size: %s", subgraph.numberOfNodes())
-                    clique = nk.clique.MaximalCliques(subgraph, maximumOnly=True)
-                    clique.run()
-                    clique_members = clique.getCliques()[0]
-                    assignments.extend(
-                        [
-                            {
-                                "id": node_to_id_map[community_node_map[n]],
-                                "cluster": cluster_index,
-                            }
-                            for n in clique_members
-                        ]
-                    )
-                    cluster_index += 1
-                    for n in clique_members:
-                        subgraph.removeNode(n)
+                cluster_index += 1
     return assignments

perception-0.8.4/perception/approximate_deduplication/_graph_backend.py ADDED

@@ -0,0 +1,138 @@
+import sys
+import typing
+from abc import ABC, abstractmethod
+class GraphBackend(ABC):
+    @abstractmethod
+    def build_graph(
+        self, node_count: int, edges: typing.Iterable[tuple[int, int]]
+    ) -> typing.Any: ...
+    @abstractmethod
+    def connected_components(self, graph: typing.Any) -> list[list[int]]: ...
+    @abstractmethod
+    def communities(
+        self, graph: typing.Any, component: list[int]
+    ) -> list[list[int]]: ...
+    @abstractmethod
+    def maximal_cliques(
+        self,
+        graph: typing.Any,
+        community_nodes: list[int],
+        max_clique_batch_size: int,
+    ) -> list[list[int]]: ...
+class NetworkitGraphBackend(GraphBackend):
+    def __init__(self):
+        import networkit as nk
+        self.nk = nk
+    def build_graph(
+        self, node_count: int, edges: typing.Iterable[tuple[int, int]]
+    ) -> typing.Any:
+        graph = self.nk.Graph(node_count)
+        for start, end in edges:
+            graph.addEdge(start, end)
+        return graph
+    def connected_components(self, graph: typing.Any) -> list[list[int]]:
+        cc_query = self.nk.components.ConnectedComponents(graph)
+        cc_query.run()
+        return cc_query.getComponents()
+    def communities(self, graph: typing.Any, component: list[int]) -> list[list[int]]:
+        component_node_map = dict(enumerate(component))
+        subgraph = self.nk.graphtools.subgraphFromNodes(graph, component, compact=True)
+        algo = self.nk.community.PLP(subgraph, maxIterations=32)
+        algo.run()
+        communities = algo.getPartition()
+        return [
+            [component_node_map[node] for node in communities.getMembers(community)]
+            for community in communities.subsetSizeMap().keys()
+        ]
+    def maximal_cliques(
+        self,
+        graph: typing.Any,
+        community_nodes: list[int],
+        max_clique_batch_size: int,
+    ) -> list[list[int]]:
+        cliques: list[list[int]] = []
+        for start in range(0, len(community_nodes), max_clique_batch_size):
+            batch_nodes = community_nodes[start : start + max_clique_batch_size]
+            community_node_map = dict(enumerate(batch_nodes))
+            subgraph = self.nk.graphtools.subgraphFromNodes(
+                graph, batch_nodes, compact=True
+            )
+            while subgraph.numberOfNodes() > 0:
+                clique = self.nk.clique.MaximalCliques(subgraph, maximumOnly=True)
+                clique.run()
+                clique_members = clique.getCliques()[0]
+                cliques.append([community_node_map[node] for node in clique_members])
+                for node in clique_members:
+                    subgraph.removeNode(node)
+        return cliques
+class NetworkxGraphBackend(GraphBackend):
+    def __init__(self):
+        import networkx as nx
+        self.nx = nx
+    def build_graph(
+        self, node_count: int, edges: typing.Iterable[tuple[int, int]]
+    ) -> typing.Any:
+        graph = self.nx.Graph()
+        graph.add_nodes_from(range(node_count))
+        graph.add_edges_from(edges)
+        return graph
+    def connected_components(self, graph: typing.Any) -> list[list[int]]:
+        return [list(component) for component in self.nx.connected_components(graph)]
+    def communities(self, graph: typing.Any, component: list[int]) -> list[list[int]]:
+        subgraph = graph.subgraph(component)
+        return [
+            list(community)
+            for community in self.nx.algorithms.community.asyn_lpa_communities(
+                subgraph, seed=0
+            )
+        ]
+    def maximal_cliques(
+        self,
+        graph: typing.Any,
+        community_nodes: list[int],
+        max_clique_batch_size: int,
+    ) -> list[list[int]]:
+        cliques: list[list[int]] = []
+        for start in range(0, len(community_nodes), max_clique_batch_size):
+            batch_nodes = community_nodes[start : start + max_clique_batch_size]
+            subgraph = graph.subgraph(batch_nodes).copy()
+            while subgraph.number_of_nodes() > 0:
+                clique_members = max(
+                    self.nx.find_cliques(subgraph),
+                    key=lambda clique: (
+                        len(clique),
+                        tuple(sorted(clique)),
+                    ),
+                )
+                cliques.append(list(clique_members))
+                subgraph.remove_nodes_from(clique_members)
+        return cliques
+def get_graph_backend() -> GraphBackend:
+    if sys.platform == "darwin":
+        return NetworkxGraphBackend()
+    return NetworkitGraphBackend()
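
Both `maximal_cliques` implementations above follow the same greedy pattern: find the largest clique in the current subgraph, emit it, remove its nodes, and repeat until the subgraph is empty. The sketch below illustrates that pattern using only the standard library, so it does not call networkit or networkx. The names `find_maximal_cliques` and `peel_cliques` are hypothetical and not part of the package, and the enumeration is a basic Bron-Kerbosch without pivoting, which is an assumption made for brevity rather than what either library does internally. The tie-break key mirrors the one in `NetworkxGraphBackend`.

```python
def find_maximal_cliques(adj):
    """Yield every maximal clique of an undirected graph.

    `adj` maps each node to the set of its neighbours. This is the basic
    Bron-Kerbosch recursion without pivoting (illustrative only).
    """

    def expand(current, candidates, excluded):
        if not candidates and not excluded:
            yield sorted(current)
            return
        for node in list(candidates):
            yield from expand(
                current | {node},
                candidates & adj[node],
                excluded & adj[node],
            )
            candidates = candidates - {node}
            excluded = excluded | {node}

    yield from expand(set(), set(adj), set())


def peel_cliques(nodes, edges):
    """Greedily split a graph into cliques, largest first."""
    adj = {node: set() for node in nodes}
    for start, end in edges:
        adj[start].add(end)
        adj[end].add(start)

    cliques = []
    while adj:
        # Same deterministic tie-break as the networkx backend in the diff:
        # prefer larger cliques, then the larger sorted node tuple.
        best = max(
            find_maximal_cliques(adj),
            key=lambda clique: (len(clique), tuple(sorted(clique))),
        )
        cliques.append(best)
        # Remove the clique's nodes and any edges that touch them.
        for node in best:
            for neighbour in adj.pop(node):
                if neighbour in adj:
                    adj[neighbour].discard(node)
    return cliques


# Triangle 0-1-2 with a tail 2-3-4.
example = peel_cliques(range(5), [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4)])
print(example)
```

Because each emitted clique is removed before the next search, every node ends up in exactly one clique, which is what lets `pairs_to_clusters` assign a single cluster index per clique.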

{perception-0.8.2 → perception-0.8.4}/perception/approximate_deduplication/debug.py RENAMED

@@ -79,10 +79,8 @@ def vizualize_pair(
             circle_size=circle_size,
         )
     else:
-        LOGGER.warning(
-            """No match_metadata provided, recalculating match points,
-            won't match perception match points."""
-        )
+        LOGGER.warning("""No match_metadata provided, recalculating match points,
+            won't match perception match points.""")
         img_matched = viz_brute_force(features_1, features_2, img1, img2, ratio=ratio)
     return img_matched

{perception-0.8.2 → perception-0.8.4}/perception/benchmarking/common.py RENAMED

@@ -366,7 +366,7 @@ class BenchmarkHashes(Filterable):
                 )
                 X_noop = np.array(
                     noops.hash.apply(
-                        string_to_vector,
+                        string_to_vector,  # type: ignore[arg-type]
                         dtype=dtype,
                         hash_format="base64",
                         hash_length=int(hash_length),
@@ -502,8 +502,11 @@ class BenchmarkHashes(Filterable):
                 ax = axs[rowIdx if nrows > 1 else colIdx]
             # Plot the charts
+            inner_keys = ["guid"] + (
+                ["transform_name"] if "transform_name" in subset.columns else []
+            )
             pos, neg = (
-                subset.groupby(["guid", "transform_name"])[
+                subset.groupby(inner_keys)[
                     [
                         "distance_to_closest_correct_image",
                         "distance_to_closest_incorrect_image",
@@ -562,8 +565,11 @@ class BenchmarkHashes(Filterable):
             grouping = ["category", "transform_name"]
         def group_func(subset):
+            inner_keys = ["guid"] + (
+                ["transform_name"] if "transform_name" in subset.columns else []
+            )
             pos, neg = (
-                subset.groupby(["guid", "transform_name"])[
+                subset.groupby(inner_keys)[
                     [
                         "distance_to_closest_correct_image",
                         "distance_to_closest_incorrect_image",

{perception-0.8.2 → perception-0.8.4}/perception/benchmarking/image.py RENAMED

@@ -4,7 +4,7 @@ import uuid
 import warnings
 import cv2
-import imgaug
+import albumentations
 import pandas as pd
 from tqdm import tqdm
@@ -119,7 +119,7 @@ class BenchmarkImageDataset(BenchmarkDataset):
     def transform(
         self,
-        transforms: dict[str, imgaug.augmenters.meta.Augmenter],
+        transforms: dict[str, albumentations.BasicTransform],
         storage_dir: str,
         errors: str = "raise",
     ) -> BenchmarkImageTransforms:
@@ -129,7 +129,7 @@ class BenchmarkImageDataset(BenchmarkDataset):
             transforms: A dictionary of transformations. The only required
                 key is `noop` which determines how the original, untransformed
                 image is saved. For a true copy, simply make the `noop` key
-                `imgaug.augmenters.Noop()`.
+                `albumentations.NoOp`
             storage_dir: A directory to store all the images along with
                 their transformed counterparts.
             errors: How to handle errors reading files. If "raise", exceptions are
@@ -145,7 +145,7 @@ class BenchmarkImageDataset(BenchmarkDataset):
         os.makedirs(storage_dir, exist_ok=True)
         files = self._df.copy()
-        files["guid"] = [uuid.uuid4() for n in range(len(files))]
+        files["guid"] = [str(uuid.uuid4()) for n in range(len(files))]
         def apply_transform(files, transform_name):
             transform = transforms[transform_name]
@@ -166,6 +166,9 @@ class BenchmarkImageDataset(BenchmarkDataset):
                     continue
                 try:
                     transformed = transform(image=image)
+                    # If albumentations, output is a dict with 'image' key
+                    if isinstance(transformed, dict) and "image" in transformed:
+                        transformed = transformed["image"]
                 except Exception as e:
                     raise RuntimeError(
                         f"An exception occurred while processing {filepath} "
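
The last hunk above unwraps the transform result when it is a dict with an `image` key, so that callables returning a bare array keep working alongside dict-returning transforms. A minimal sketch of that normalization follows. It deliberately avoids importing albumentations: `dict_style_transform` and `plain_transform` are hypothetical stand-ins, and `unwrap_image` is an illustrative helper name that does not exist in the package.

```python
def unwrap_image(transformed):
    """Return the image payload whether or not it is wrapped in a dict."""
    if isinstance(transformed, dict) and "image" in transformed:
        return transformed["image"]
    return transformed


def dict_style_transform(image):
    # Stand-in for a transform that returns {"image": ...}.
    return {"image": image}


def plain_transform(image):
    # Stand-in for a callable that returns the image directly.
    return image


pixels = [[0, 1], [2, 3]]  # placeholder for an image array
results = [
    unwrap_image(transform(image=pixels))
    for transform in (dict_style_transform, plain_transform)
]
```

Both entries in `results` are the original `pixels` object, so downstream code can treat the two transform styles identically.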

{perception-0.8.2 → perception-0.8.4}/perception/benchmarking/image_transforms.py RENAMED

@@ -17,7 +17,7 @@ def apply_watermark(watermark, alpha: float = 1.0, size: float = 1.0):
     # Why do we have to do this? It's not clear. But the process doesn't work
     # without it.
-    (B, G, R, A) = cv2.split(watermark)
+    B, G, R, A = cv2.split(watermark)
     B = cv2.bitwise_and(B, B, mask=A)
     G = cv2.bitwise_and(G, G, mask=A)
     R = cv2.bitwise_and(R, R, mask=A)
@@ -25,7 +25,7 @@ def apply_watermark(watermark, alpha: float = 1.0, size: float = 1.0):
     def transform(image):
         # Add alpha channel
-        (h, w) = image.shape[:2]
+        h, w = image.shape[:2]
         wh, ww = watermark.shape[:2]
         scale = size * min(h / wh, w / ww)
         image = np.dstack([image, np.ones((h, w), dtype="uint8") * 255])

{perception-0.8.2 → perception-0.8.4}/perception/benchmarking/video.py RENAMED

@@ -94,7 +94,7 @@ class BenchmarkVideoDataset(BenchmarkDataset):
         os.makedirs(storage_dir, exist_ok=True)
         files = self._df.copy()
-        files["guid"] = [uuid.uuid4() for n in range(len(files))]
+        files["guid"] = [str(uuid.uuid4()) for n in range(len(files))]
         def apply_transform_to_file(input_filepath, guid, transform_name, category):
             if input_filepath is None:

{perception-0.8.2 → perception-0.8.4}/perception/hashers/__init__.py RENAMED

@@ -7,7 +7,6 @@ from .image.wavelet import WaveletHash
 from .video.framewise import FramewiseHasher
 from .video.tmk import TMKL1, TMKL2
 __all__ = [
     "ImageHasher",
     "VideoHasher",
@@ -24,3 +23,10 @@ __all__ = [
     "PHashU8",
     "PHashF",
 ]
+try:
+    from .image.pdq import PDQHash as PDQHash, PDQHashF as PDQHashF
+except ImportError:
+    pass
+else:
+    __all__.extend(["PDQHash", "PDQHashF"])
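
With this hunk, `PDQHash` and `PDQHashF` are exported from `perception.hashers` only when their import succeeds, which lines up with `pdqhash` moving behind the `pdq` extra in PKG-INFO. Calling code therefore cannot assume those names exist. The sketch below shows one way a caller might probe for them; `load_optional` is a hypothetical helper, not something the package provides.

```python
import importlib


def load_optional(module_name, attribute):
    """Return `module.attribute`, or None if the module or name is missing."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return getattr(module, attribute, None)


# None unless perception is installed together with its optional PDQ dependency.
pdq_hash_cls = load_optional("perception.hashers", "PDQHash")
if pdq_hash_cls is None:
    print("PDQHash is not available in this environment")
```

Checking for `None` once at startup keeps the rest of the calling code free of scattered `try`/`except ImportError` blocks.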

{perception-0.8.2 → perception-0.8.4}/perception/hashers/image/opencv.py RENAMED

@@ -24,7 +24,7 @@ class MarrHildreth(OpenCVHasher):
     def __init__(self):
         super().__init__()
-        self.hasher = cv2.img_hash.MarrHildrethHash.create()
+        self.hasher = cv2.img_hash.MarrHildrethHash.create()  # type: ignore[attr-defined]
     def _compute(self, image):
         return np.unpackbits(self.hasher.compute(image)[0])
@@ -40,7 +40,7 @@ class ColorMoment(OpenCVHasher):
     def __init__(self):
         super().__init__()
-        self.hasher = cv2.img_hash.ColorMomentHash.create()
+        self.hasher = cv2.img_hash.ColorMomentHash.create()  # type: ignore[attr-defined]
     def _compute(self, image):
         return 10000 * self.hasher.compute(image)[0]
@@ -56,7 +56,7 @@ class BlockMean(OpenCVHasher):
     def __init__(self):
         super().__init__()
-        self.hasher = cv2.img_hash.BlockMeanHash.create(1)
+        self.hasher = cv2.img_hash.BlockMeanHash.create(1)  # type: ignore[attr-defined]
     def _compute(self, image):
         # https://stackoverflow.com/questions/54762896/why-cv2-norm-hamming-gives-different-value-than-actual-hamming-distance

{perception-0.8.2 → perception-0.8.4}/perception/hashers/tools.py RENAMED

@@ -11,6 +11,7 @@ import os
 import queue
 import shlex
 import subprocess
+import tempfile
 import threading
 import typing
 import warnings
@@ -27,7 +28,9 @@ import validators
 LOGGER = logging.getLogger(__name__)
-ImageInputType = typing.Union[str, np.ndarray, "PIL.Image.Image", io.BytesIO]
+ImageInputType = typing.Union[
+    str, np.ndarray, "PIL.Image.Image", io.BytesIO, tempfile.SpooledTemporaryFile
+]
 SIZES = {"float32": 32, "uint8": 8, "bool": 1}
@@ -357,7 +360,10 @@ def read(filepath_or_buffer: ImageInputType, timeout=None) -> np.ndarray:
     """
     if isinstance(filepath_or_buffer, PIL.Image.Image):
         return np.array(filepath_or_buffer.convert("RGB"))
-    if isinstance(filepath_or_buffer, (io.BytesIO, client.HTTPResponse)):
+    if isinstance(
+        filepath_or_buffer,
+        (io.BytesIO, client.HTTPResponse, tempfile.SpooledTemporaryFile),
+    ):
         image = np.asarray(bytearray(filepath_or_buffer.read()), dtype=np.uint8)
         decoded_image = cv2.imdecode(image, cv2.IMREAD_UNCHANGED)
     elif isinstance(filepath_or_buffer, str):
@@ -561,9 +567,7 @@ def read_video_to_generator_ffmpeg(
             ), f"Invalid framerate: {frames_per_second}"
             seconds_per_frame = 1 / frames_per_second
             filters.append(
-                f"fps={frames_per_second}:"
-                f"round={frame_rounding}:"
-                f"start_time={offset}"
+                f"fps={frames_per_second}:round={frame_rounding}:start_time={offset}"
             )
         # Add resizing filters.
         if use_cuda and codec_name in CUDA_CODECS:
@@ -601,7 +605,12 @@ def read_video_to_generator_ffmpeg(
                 if not batch:
                     break
                 for image in np.frombuffer(batch, dtype="uint8").reshape(
-                    (-1, height, width, channels)
+                    (
+                        -1,
+                        height,
+                        width,
+                        channels,
+                    )
                 ):
                     if frames_per_second != "keyframes":
                         yield (image, frame_index, timestamp)
@@ -960,137 +969,206 @@ def compute_synchronized_video_hashes(
 def unletterbox(
-    image, only_remove_black: bool = False, min_fraction_meaningful_pixels: float = 0.1
+    image: np.ndarray,
+    only_remove_black: bool = False,
+    min_fraction_meaningful_pixels: float = 0.1,
+    color_threshold: float = 2,
+    min_side_length: int = 50,
+    min_reduction: float = 0.02,
 ) -> tuple[tuple[int, int], tuple[int, int]] | None:
-    """Return bounds of non-trivial region of image or None.
-    Unletterboxing is cropping an image such that trivial edge regions
-    are removed. Trivial in this context means that the majority of
-    the values in that row or column are zero or very close to
-    zero.
-    In order to do unletterboxing, this function returns bounds in the
-    form (x1, x2), (y1, y2) where:
-    - x1 is the index of the first column where over X% of the pixels
-      have means (average of R, G, B) > 2.
-    - x2 is the index of the last column where over X% of the pixels
-      have means > 2.
-    - y1 is the index of the first row where over X% of the pixels
-      have means > 2.
-    - y2 is the index of the last row where over X% of the pixels
-      have means > 2.
-    - X is min_fraction_meaningful_pixels 0.1 == 10%
-    If there are zero columns or zero rows where over X% of the
-    pixels have means > 2, this function returns `None`.
-    Note that in the case(s) of a single column and/or row of
-    non-trivial pixels that it is possible for x1 = x2 and/or y1 = y2.
-    Consider these examples to understand edge cases.  Given two
-    images, `L` (entire left and bottom edges are 1, all other pixels
-    0) and `U` (left, bottom and right edges 1, all other pixels 0),
-    `unletterbox(L)` would return the bounds of the single bottom-left
-    pixel and `unletterbox(U)` would return the bounds of the entire
-    bottom row.
-    Consider `U1` which is the same as `U` but with the bottom two
-    rows all 1s. `unletterbox(U1)` returns the bounds of the bottom
-    two rows.
+    """Return bounds of the non-trivial (content) region of an image, or None.
+    Letterboxing refers to uniform-color borders added around an image
+    (e.g., black bars on a video frame). This function detects such borders
+    by identifying the background color from the image corners and finding
+    the bounding box of pixels that differ from that background.
+    The function returns bounds as ``(x1, x2), (y1, y2)`` suitable for
+    slicing: ``image[y1:y2, x1:x2]``. The bounds are exclusive on the
+    right/bottom (i.e., x2 and y2 point one past the last content pixel).
+    **Algorithm overview:**
+    1. Sample the four corner pixels and find the most common value as
+       the candidate background color. If all four corners differ, return
+       ``None`` (no consistent letterbox detected).
+    2. Build a binary content mask where each pixel whose grayscale
+       intensity differs from the background by more than
+       ``color_threshold`` is marked as content.
+    3. Project the mask onto rows and columns and find the first/last
+       row and column where the fraction of content pixels exceeds
+       ``min_fraction_meaningful_pixels``.
+    4. Validate that the resulting crop is meaningfully smaller than the
+       original (controlled by ``min_reduction``) and that both sides
+       exceed ``min_side_length``.
+    Returns ``None`` when:
+    - No two corners share the same color (no clear background).
+    - Every pixel differs from the detected background (no border).
+    - No row or column meets the content-pixel threshold.
+    - The crop would not remove at least ``min_reduction`` fraction
+      from any dimension.
+    - Either cropped dimension would be smaller than ``min_side_length``.
     Args:
-        image: The image from which to remove letterboxing.
-        only_remove_black: Set False to remove borders fo any color.
-        min_fraction_meaningful_pixels: 0 to 1: if cropped version is
-        smaller than this fraction of the image do not unletterbox.
-        0.1 == 10% of the image.
+        image: Input image as an ``np.ndarray``. May be grayscale (H×W)
+            or RGB (H×W×3); RGB images are converted to grayscale
+            internally for background detection.
+        only_remove_black: If ``True``, treat black (intensity 0) as the
+            background regardless of corner colors. If ``False`` (default),
+            infer the background color from the most common corner value.
+        min_fraction_meaningful_pixels: The minimum fraction (0–1) of
+            pixels in a row or column that must differ from the background
+            for that row/column to be considered part of the content region.
+            Defaults to 0.1 (10%).
+        color_threshold: The minimum absolute difference in grayscale
+            intensity between a pixel and the background color for that
+            pixel to be classified as content. Defaults to 2.
+        min_side_length: The minimum width or height (in pixels) of the
+            cropped region. If the crop would be smaller, ``None`` is
+            returned. Defaults to 50.
+        min_reduction: The minimum fraction (0–1) of the original width
+            or height that must be removed for the crop to be worthwhile.
+            If the crop removes less than this from both dimensions,
+            ``None`` is returned. Defaults to 0.02 (2%).
     Returns:
-        A pair of coordinates bounds of the form (x1, x2)
-        and (y1, y2) representing the left, right, top, and
-        bottom bounds.
+        A tuple ``((x1, x2), (y1, y2))`` giving the left, right, top,
+        and bottom bounds of the content region (right/bottom exclusive),
+        or ``None`` if no meaningful letterbox was detected.
     """
-    assert 0 <= min_fraction_meaningful_pixels <= 1, "min_size must be between 0 and 1"
-    if not only_remove_black:
-        height, width, colors = image.shape
-        bottom = height - 1
-        right = width - 1
-        top = 0
-        left = 0
-        # Generate a count of the corner pixels.
-        counts = Counter(
-            [
-                tuple(image[top, left]),
-                tuple(image[top, right]),
-                tuple(image[bottom, left]),
-                tuple(image[bottom, right]),
-            ]
+    if not 0 <= min_fraction_meaningful_pixels <= 1:
+        raise ValueError("min_fraction_meaningful_pixels must be between 0 and 1")
+    if not 0 <= min_reduction <= 1:
+        raise ValueError("min_reduction must be between 0 and 1")
+    image = image.astype(np.uint8)
+    shape = image.shape
+    h, w = shape[0:2]
+    if len(shape) == 3:
+        image = cv2.cvtColor(image, cv2.COLOR_RGB2GRAY)
+    # Determine background color and build binary content mask.
+    if only_remove_black:
+        bg_gray = 0
+    else:
+        # Sample the four corner pixels. If all four are unique there is no
+        # consistent background color, so we bail out early (O(1) rejection).
+        corners = (
+            image[0, 0],
+            image[0, w - 1],
+            image[h - 1, 0],
+            image[h - 1, w - 1],
         )
-        if len(counts) == 4:
-            return (0, image.shape[1]), (0, image.shape[0])
-        # Grab reference color.
-        # We grab the most common shared color.
-        bg_color, _ = counts.most_common(1)[0]
-        # Create an image of just that color. dtype to match image.
-        mask = np.ones((height, width, colors), dtype=np.int16)
-        mask[:, :] = np.array(bg_color)
-        # Diff the image so that color is black.
-        image = np.abs(np.subtract(image, mask))
-    # adj should be thought of as a boolean at each pixel indicating
-    # whether or not that pixel is non-trivial (True) or not (False).
-    adj = image.mean(axis=2) > 2
-    if adj.all():
-        return (0, image.shape[1] + 1), (0, image.shape[0] + 1)
+        if len(set(corners)) == 4:
+            LOGGER.debug("No common corner color detected, skipping content detection.")
+            return (
+                (0, w),
+                (0, h),
+            )  # Return full image bounds instead of None to maintain backwards compatibility
+        # Use the most common corner value as the background intensity.
+        counts = Counter(corners)
+        bg_gray = counts.most_common(1)[0][0]
+    # Mark pixels whose grayscale intensity differs from the background
+    # by more than color_threshold as content (True).
+    content_mask = np.abs(image.astype(np.int16) - bg_gray) > color_threshold
+    # If every pixel is classified as content, there is no border to remove.
+    if content_mask.all():
+        LOGGER.debug("All pixels differ from background; no letterbox detected.")
+        return (
+            (0, w),
+            (0, h),
+        )  # Return full image bounds instead of None to maintain backwards compatibility
+    # Find the content bounding box by projecting the mask onto rows and
+    # columns. cv2.reduce is used instead of np.sum for performance.
+    mask_u8 = content_mask.astype(np.uint8)
+    row_content = cv2.reduce(mask_u8, 1, cv2.REDUCE_SUM, dtype=cv2.CV_32S).ravel()
+    col_content = cv2.reduce(mask_u8, 0, cv2.REDUCE_SUM, dtype=cv2.CV_32S).ravel()
+    # Thresholds for minimum content per row/column
+    row_threshold = min_fraction_meaningful_pixels * w
+    col_threshold = min_fraction_meaningful_pixels * h
+    # Find first/last rows and columns with sufficient content
+    content_rows = np.where(row_content > row_threshold)[0]
+    content_cols = np.where(col_content > col_threshold)[0]
+    if len(content_rows) == 0 or len(content_cols) == 0:
+        LOGGER.debug("No rows or columns with sufficient content detected.")
+        return None
-    # Find rows and cols with at least min_fraction_meaningful_pixels.
-    y = np.where(adj.sum(axis=1) > min_fraction_meaningful_pixels * image.shape[1])[0]
-    x = np.where(adj.sum(axis=0) > min_fraction_meaningful_pixels * image.shape[0])[0]
+    top = int(content_rows[0])
+    bottom = int(content_rows[-1]) + 1
+    left = int(content_cols[0])
+    right = int(content_cols[-1]) + 1
+    height = bottom - top
+    width = right - left
-    # Either no rows or no columns had enough meaningful information to keep.
-    if len(y) == 0 or len(x) == 0:
+    # Reject if the crop does not remove at least min_reduction from
+    # at least one dimension (i.e., the border is negligibly thin).
+    if width >= w * (1 - min_reduction) and height >= h * (1 - min_reduction):
+        LOGGER.debug(
+            "Crop would not reduce either dimension by %.0f%%; skipping.",
+            min_reduction * 100,
+        )
+        return (
+            (0, w),
+            (0, h),
+        )  # Return full image bounds instead of None to maintain backwards compatibility
+    # Reject if the remaining content region is too small to be useful.
+    if width < min_side_length or height < min_side_length:
+        LOGGER.debug(
+            "Cropped region (%dx%d) smaller than min_side_length=%d; skipping.",
+            width,
+            height,
+            min_side_length,
+        )
         return None
-    if len(y) == 1:
-        y1 = y2 = y[0]
-    else:
-        y1, y2 = y[[0, -1]]
-    if len(x) == 1:
-        x1 = x2 = x[0]
-    else:
-        x1, x2 = x[[0, -1]]
-    bounds = (x1, x2 + 1), (y1, y2 + 1)
-    return bounds
+    return ((left, right), (top, bottom))
 def unletterbox_crop(
-    image: np.ndarray, min_fraction_meaningful_pixels: float = 0.1
+    image: np.ndarray,
+    min_fraction_meaningful_pixels: float = 0.1,
+    color_threshold: float = 2,
+    min_side_length: int = 50,
+    min_reduction: float = 0.02,
 ) -> np.ndarray | None:
     """Detect and crop the letterboxed regions from an image.
     Args:
         image: The image from which to remove letterboxing.
         min_fraction_meaningful_pixels: 0 to 1: if cropped version is
-        smaller than this fraction of the image do not unletterbox.
-        0.1 == 10% of the image.
+            smaller than this fraction of the image do not unletterbox.
+            0.1 == 10% of the image.
+        color_threshold: The minimum absolute difference in grayscale
+            intensity between a pixel and the background color for that
+            pixel to be classified as content. Defaults to 2.
+        min_side_length: The minimum width or height (in pixels) of the
+            cropped region. If the crop would be smaller, ``None`` is
+            returned. Defaults to 50.
+        min_reduction: The minimum fraction (0–1) of the original width
+            or height that must be removed for the crop to be worthwhile.
+            If the crop removes less than this from both dimensions,
+            the original image is returned. Defaults to 0.02 (2%).
     Returns:
         The cropped image or None if the image is mostly blank space.
     """
-    assert isinstance(
-        image, np.ndarray
-    ), "Please send np.ndarray to unletterbox_image()."
+    if not isinstance(image, np.ndarray):
+        raise TypeError(f"Expected np.ndarray, got {type(image).__name__}")
     bounds = unletterbox(
-        image, min_fraction_meaningful_pixels=min_fraction_meaningful_pixels
+        image,
+        min_fraction_meaningful_pixels=min_fraction_meaningful_pixels,
+        color_threshold=color_threshold,
+        min_side_length=min_side_length,
+        min_reduction=min_reduction,
     )
     if bounds is None:
         return None
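
Note on the hunk above: the new `unletterbox` logic builds a content mask against a background intensity, projects it onto rows and columns, and takes the first and last indices that clear a threshold. The following is a minimal sketch of that projection step only, assuming a 2-D grayscale input and a known background value. The function name `sketch_content_bounds` is hypothetical, `np.sum` stands in for the `cv2.reduce` call used in the diff, and the corner sampling, `min_reduction`, and `min_side_length` checks are left out.

```python
import numpy as np


def sketch_content_bounds(
    gray: np.ndarray,
    bg_gray: int,
    color_threshold: float = 2,
    min_fraction_meaningful_pixels: float = 0.1,
):
    """Hypothetical helper: return ((left, right), (top, bottom)) or None."""
    h, w = gray.shape
    # Pixels that differ from the background by more than the threshold are content.
    content_mask = np.abs(gray.astype(np.int16) - bg_gray) > color_threshold
    # Project the mask onto rows and columns (np.sum here; the diff uses cv2.reduce).
    row_content = content_mask.sum(axis=1)
    col_content = content_mask.sum(axis=0)
    # A row is compared against a fraction of the width, a column against the height.
    content_rows = np.where(row_content > min_fraction_meaningful_pixels * w)[0]
    content_cols = np.where(col_content > min_fraction_meaningful_pixels * h)[0]
    if len(content_rows) == 0 or len(content_cols) == 0:
        return None
    top, bottom = int(content_rows[0]), int(content_rows[-1]) + 1
    left, right = int(content_cols[0]), int(content_cols[-1]) + 1
    return ((left, right), (top, bottom))


# A 100x200 black frame with a bright 60x120 block in the middle.
frame = np.zeros((100, 200), dtype=np.uint8)
frame[20:80, 40:160] = 200
bounds = sketch_content_bounds(frame, bg_gray=0)
print(bounds)  # ((40, 160), (20, 80))
```

The bounds are half-open, x first and then y, matching the `((left, right), (top, bottom))` tuple returned in the diff.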

{perception-0.8.2 → perception-0.8.4}/perception/hashers/video/tmk.py RENAMED

@@ -56,7 +56,7 @@ class TMKL2(VideoHasher):
                 for i in range(1, self.m)
             ]
         )
-        a = a.reshape(1, -1).repeat(repeats=len(self.T), axis=0)
+        a = a.reshape(1, -1).repeat(repeats=len(self.T), axis=0)  # type: ignore
         a = np.sqrt(a)
         self.a = a[..., np.newaxis]
@@ -77,7 +77,12 @@ class TMKL2(VideoHasher):
     def hash_from_final_state(self, state):
         timestamps = np.array(state["timestamps"])
         features = np.array(state["features"]).reshape(
-            (1, 1, timestamps.shape[0], self.frame_hasher.hash_length)
+            (
+                1,
+                1,
+                timestamps.shape[0],
+                self.frame_hasher.hash_length,
+            )
         )
         x = self.ms_normed * timestamps
         yw1 = np.sin(x) * self.a

{perception-0.8.2 → perception-0.8.4}/perception/testing/__init__.py RENAMED

@@ -101,8 +101,8 @@ def hash_dicts_to_df(hash_dicts, returns_multiple):
                 ),
                 "hash": tools.flatten([h["hash"] for h in hash_dicts]),
             }
-        ).assign(error=None)
-    return pd.DataFrame.from_records(hash_dicts).assign(error=None)
+        ).assign(error=np.nan)
+    return pd.DataFrame.from_records(hash_dicts).assign(error=np.nan)
 def test_hasher_parallelization(hasher, test_filepaths):
@@ -156,7 +156,9 @@ def test_image_hasher_integrity(
     image2 = test_images[1]
     hash1_1 = str(hasher.compute(image1))  # str() games for mypy, not proud
     hash1_2 = str(hasher.compute(Image.open(image1)))
-    hash1_3 = str(hasher.compute(cv2.cvtColor(cv2.imread(image1), cv2.COLOR_BGR2RGB)))
+    image_cv = cv2.imread(image1)
+    assert image_cv is not None, f"Failed to load image: {image1}"
+    hash1_3 = str(hasher.compute(cv2.cvtColor(image_cv, cv2.COLOR_BGR2RGB)))
     hash2_1 = str(hasher.compute(image2))
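
Note on the first hunk above: `hash_dicts_to_df` now fills the `error` column with `np.nan` instead of `None`. The diff does not state the motivation. One observable difference, sketched below with made-up records, is that the `np.nan` column has a numeric dtype, whereas a column assigned `None` is typically stored as `object`. Both read as missing under `isna()`.

```python
import numpy as np
import pandas as pd

# Illustrative records only; the real ones come from the hashing helpers.
records = [
    {"filepath": "a.jpg", "hash": "abc"},
    {"filepath": "b.jpg", "hash": "def"},
]

with_none = pd.DataFrame.from_records(records).assign(error=None)
with_nan = pd.DataFrame.from_records(records).assign(error=np.nan)

# Both columns are entirely missing values.
print(with_none["error"].isna().all(), with_nan["error"].isna().all())
# The np.nan column is float64.
print(with_nan["error"].dtype)
```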

perception-0.8.4/perception/testing/videos/extra_channel_attached_pic.mp4 ADDED

Binary file

perception-0.8.4/perception/testing/videos/extra_channel_attached_pic_audio.mp4 ADDED

Binary file

{perception-0.8.2 → perception-0.8.4}/pyproject.toml RENAMED

@@ -1,55 +1,46 @@
-[tool.poetry]
+[project]
 name = "Perception"
-version = "0.8.2"
+dynamic = []
 description = "Perception provides flexible, well-documented, and comprehensively tested tooling for perceptual hashing research, development, and production use."
-authors = ["Thorn <info@wearethorn.org>"]
-license = "Apache License 2.0"
+authors = [{ name = "Thorn", email = "info@wearethorn.org" }]
+license = "Apache-2.0"
 readme = "README.md"
+requires-python = ">=3.10,<4.0"
+dependencies = [
+  "Cython>=3.0.0,<4.0.0",
+  "numpy>=1.26.4,<3.0.0",
+  "opencv-contrib-python-headless>=4.10.0,<5.0.0",
+  "faiss-cpu>=1.8.0,<2.0.0",
+  "networkit>=11.1,<12.0.0; sys_platform != 'darwin'",
+  "networkx>=3.0,<4.0; sys_platform == 'darwin'",
+  "pandas",
+  "Pillow",
+  "pywavelets>=1.5.0,<2.0.0",
+  "validators>=0.22.0,<1.0.0",
+  "rich>=13.7.0,<14.0.0",
+  "scipy",
+  "tqdm>=4.67.1,<5.0.0",
+]
+version = "0.8.4"
-[tool.poetry.dependencies]
-python = "^3.10"
-Cython = "^3"
-numpy = "^1.26"
-opencv-contrib-python-headless = "^4.10"
-pandas = "*"
-pdqhash = "*"
-Pillow = "*"
-pywavelets = "^1.5.0"
-tqdm = "*"
-validators = ">=0.22, <1.0"
-scipy = "*"
-# Benchmarking Extras
-matplotlib = { version = "*", optional = true }
-imgaug = { version = "*", optional = true }
-tabulate = { version = "*", optional = true }
-scikit-learn = { version = "*", optional = true }
-ffmpeg-python = { version = "*", optional = true }
-# Matching Extras
-aiohttp = { version = "*", optional = true }
-python-json-logger = { version = "*", optional = true }
-rich = "^13.7.0"
-# Experimental Extras
-networkit = { version = "^11", optional = true }
-faiss-cpu = { version = "^1.8.0.post1", optional = true }
-[tool.poetry.extras]
+[project.optional-dependencies]
 benchmarking = [
   "matplotlib",
-  "scipy",
-  "imgaug",
+  "albumentations>=2.0.8,<3.0.0",
   "tabulate",
   "scikit-learn",
   "ffmpeg-python",
 ]
 matching = ["aiohttp", "python-json-logger"]
-experimental = ["networkit", "faiss-cpu"]
+pdq = ["pdqhash>=0.2.7,<0.3.0"]
+[tool.poetry]
 [tool.poetry.group.dev.dependencies]
-black = "^24"
+black = "^26"
 coverage = "*"
 ipython = "*"
 mypy = "*"
@@ -61,6 +52,7 @@ ruff = "*"
 types-pillow = "*"
 types-tqdm = "*"
 twine = "*"
+albumentations = "^2.0.8"
 [tool.poetry.build]
@@ -74,6 +66,7 @@ ignore_missing_imports = true
 [tool.poetry-dynamic-versioning]
 enable = false
+vcs = "git"
 [build-system]
 requires = [

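Note on the pyproject.toml changes above: `pdqhash` moves from a required dependency to the new `pdq` extra, so code that relies on PDQ hashing may want to check for the package first. This is a minimal sketch of such a check; the helper name `pdq_available` is hypothetical, and how `perception` itself reacts to a missing `pdqhash` is not shown in this diff section.

```python
import importlib.util


def pdq_available() -> bool:
    """Return True if the optional `pdqhash` package can be imported."""
    return importlib.util.find_spec("pdqhash") is not None


if pdq_available():
    print("pdqhash is installed.")
else:
    # The extra name comes from the diff: pdq = ["pdqhash>=0.2.7,<0.3.0"]
    print("pdqhash is missing; try: pip install 'Perception[pdq]'")
```
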
{perception-0.8.2 → perception-0.8.4}/setup.py RENAMED

@@ -14,30 +14,32 @@ package_data = \
 {'': ['*'], 'perception.testing': ['images/*', 'logos/*', 'videos/*']}
 install_requires = \
-['Cython>=3,<4',
+['Cython>=3.0.0,<4.0.0',
  'Pillow',
- 'numpy>=1.26,<2.0',
- 'opencv-contrib-python-headless>=4.10,<5.0',
+ 'faiss-cpu>=1.8.0,<2.0.0',
+ 'numpy>=1.26.4,<3.0.0',
+ 'opencv-contrib-python-headless>=4.10.0,<5.0.0',
  'pandas',
- 'pdqhash',
  'pywavelets>=1.5.0,<2.0.0',
  'rich>=13.7.0,<14.0.0',
- 'tqdm',
- 'validators>=0.22,<1.0']
+ 'scipy',
+ 'tqdm>=4.67.1,<5.0.0',
+ 'validators>=0.22.0,<1.0.0']
 extras_require = \
-{':extra == "benchmarking"': ['scipy'],
+{':sys_platform != "darwin"': ['networkit>=11.1,<12.0.0'],
+ ':sys_platform == "darwin"': ['networkx>=3.0,<4.0'],
  'benchmarking': ['matplotlib',
-                  'imgaug',
+                  'albumentations>=2.0.8,<3.0.0',
                   'tabulate',
                   'scikit-learn',
                   'ffmpeg-python'],
- 'experimental': ['networkit>=11,<12', 'faiss-cpu>=1.8.0.post1,<2.0.0'],
- 'matching': ['aiohttp', 'python-json-logger']}
+ 'matching': ['aiohttp', 'python-json-logger'],
+ 'pdq': ['pdqhash>=0.2.7,<0.3.0']}
 setup_kwargs = {
     'name': 'Perception',
-    'version': '0.8.2',
+    'version': '0.8.4',
     'description': 'Perception provides flexible, well-documented, and comprehensively tested tooling for perceptual hashing research, development, and production use.',
     'long_description': "# perception ![ci](https://github.com/thorn-oss/perception/workflows/ci/badge.svg)\n\n`perception` provides flexible, well-documented, and comprehensively tested tooling for perceptual hashing research, development, and production use. See [the documentation](https://perception.thorn.engineering/en/latest/) for details.\n\n## Background\n\n`perception` was initially developed at [Thorn](https://www.thorn.org) as part of our work to eliminate child sexual abuse material from the internet. For more information on the issue, check out [our CEO's TED talk](https://www.thorn.org/blog/time-is-now-eliminate-csam/).\n\n## Getting Started\n\n### Installation\n\n`pip install perception`\n\n### Hashing\n\nHashing with different functions is simple with `perception`.\n\n```python\nfrom perception import hashers\n\nfile1, file2 = 'test1.jpg', 'test2.jpg'\nhasher = hashers.PHash()\nhash1, hash2 = hasher.compute(file1), hasher.compute(file2)\ndistance = hasher.compute_distance(hash1, hash2)\n```\n\n### Examples\n\nSee below for end-to-end examples for common use cases for perceptual hashes.\n\n- [Detecting child sexual abuse material](https://perception.thorn.engineering/en/latest/examples/detecting_csam.html)\n- [Deduplicating media](https://perception.thorn.engineering/en/latest/examples/deduplication.html)\n- [Benchmarking perceptual hashes](https://perception.thorn.engineering/en/latest/examples/benchmarking.html)\n\n## Supported Hashing Algorithms\n\n`perception` currently ships with:\n\n- pHash (DCT hash) (`perception.hashers.PHash`)\n- Facebook's PDQ Hash (`perception.hashers.PDQ`)\n- dHash (difference hash) (`perception.hashers.DHash`)\n- aHash (average hash) (`perception.hashers.AverageHash`)\n- Marr-Hildreth (`perception.hashers.MarrHildreth`)\n- Color Moment (`perception.hashers.ColorMoment`)\n- Block Mean (`perception.hashers.BlockMean`)\n- wHash (wavelet hash) (`perception.hashers.WaveletHash`)\n\n## Contributing\n\nTo work on the project, 
start by doing the following.\n\n```bash\n# Install local dependencies for\n# code completion, etc.\nmake init\n\n- To do a (close to) comprehensive check before committing code, you can use `make precommit`.\n\nTo implement new features, please first file an issue proposing your change for discussion.\n\nTo report problems, please file an issue with sample code, expected results, actual results, and a complete traceback.\n\n## Alternatives\n\nThere are other packages worth checking out to see if they meet your needs for perceptual hashing. Here are some\nexamples.\n\n- [dedupe](https://github.com/dedupeio/dedupe)\n- [imagededup](https://idealo.github.io/imagededup/)\n- [ImageHash](https://github.com/JohannesBuchner/imagehash)\n- [PhotoHash](https://github.com/bunchesofdonald/photohash)\n```\n",
     'author': 'Thorn',