Perception 0.8.2__tar.gz → 0.8.4__tar.gz
- {perception-0.8.2 → perception-0.8.4}/PKG-INFO +14 -14
- {perception-0.8.2 → perception-0.8.4}/build.py +0 -1
- {perception-0.8.2 → perception-0.8.4}/perception/approximate_deduplication/__init__.py +22 -47
- perception-0.8.4/perception/approximate_deduplication/_graph_backend.py +138 -0
- {perception-0.8.2 → perception-0.8.4}/perception/approximate_deduplication/debug.py +2 -4
- {perception-0.8.2 → perception-0.8.4}/perception/benchmarking/common.py +9 -3
- {perception-0.8.2 → perception-0.8.4}/perception/benchmarking/image.py +7 -4
- {perception-0.8.2 → perception-0.8.4}/perception/benchmarking/image_transforms.py +2 -2
- {perception-0.8.2 → perception-0.8.4}/perception/benchmarking/video.py +1 -1
- {perception-0.8.2 → perception-0.8.4}/perception/hashers/__init__.py +7 -1
- {perception-0.8.2 → perception-0.8.4}/perception/hashers/image/opencv.py +3 -3
- {perception-0.8.2 → perception-0.8.4}/perception/hashers/tools.py +189 -111
- {perception-0.8.2 → perception-0.8.4}/perception/hashers/video/tmk.py +7 -2
- {perception-0.8.2 → perception-0.8.4}/perception/testing/__init__.py +5 -3
- perception-0.8.4/perception/testing/videos/extra_channel_attached_pic.mp4 +0 -0
- perception-0.8.4/perception/testing/videos/extra_channel_attached_pic_audio.mp4 +0 -0
- {perception-0.8.2 → perception-0.8.4}/pyproject.toml +30 -37
- {perception-0.8.2 → perception-0.8.4}/setup.py +13 -11
- {perception-0.8.2 → perception-0.8.4}/LICENSE +0 -0
- {perception-0.8.2 → perception-0.8.4}/README.md +0 -0
- {perception-0.8.2 → perception-0.8.4}/perception/__init__.py +0 -0
- {perception-0.8.2 → perception-0.8.4}/perception/approximate_deduplication/index.py +0 -0
- {perception-0.8.2 → perception-0.8.4}/perception/approximate_deduplication/serve.py +0 -0
- {perception-0.8.2 → perception-0.8.4}/perception/benchmarking/__init__.py +0 -0
- {perception-0.8.2 → perception-0.8.4}/perception/benchmarking/extensions.pyx +0 -0
- {perception-0.8.2 → perception-0.8.4}/perception/benchmarking/video_transforms.py +0 -0
- {perception-0.8.2 → perception-0.8.4}/perception/extensions.pyx +0 -0
- {perception-0.8.2 → perception-0.8.4}/perception/hashers/hasher.py +0 -0
- {perception-0.8.2 → perception-0.8.4}/perception/hashers/image/__init__.py +0 -0
- {perception-0.8.2 → perception-0.8.4}/perception/hashers/image/average.py +0 -0
- {perception-0.8.2 → perception-0.8.4}/perception/hashers/image/dhash.py +0 -0
- {perception-0.8.2 → perception-0.8.4}/perception/hashers/image/pdq.py +0 -0
- {perception-0.8.2 → perception-0.8.4}/perception/hashers/image/phash.py +0 -0
- {perception-0.8.2 → perception-0.8.4}/perception/hashers/image/wavelet.py +0 -0
- {perception-0.8.2 → perception-0.8.4}/perception/hashers/video/__init__.py +0 -0
- {perception-0.8.2 → perception-0.8.4}/perception/hashers/video/framewise.py +0 -0
- {perception-0.8.2 → perception-0.8.4}/perception/local_descriptor_deduplication.py +0 -0
- {perception-0.8.2 → perception-0.8.4}/perception/py.typed +0 -0
- {perception-0.8.2 → perception-0.8.4}/perception/testing/images/README.md +0 -0
- {perception-0.8.2 → perception-0.8.4}/perception/testing/images/image1.jpg +0 -0
- {perception-0.8.2 → perception-0.8.4}/perception/testing/images/image10.jpg +0 -0
- {perception-0.8.2 → perception-0.8.4}/perception/testing/images/image2.jpg +0 -0
- {perception-0.8.2 → perception-0.8.4}/perception/testing/images/image3.jpg +0 -0
- {perception-0.8.2 → perception-0.8.4}/perception/testing/images/image4.jpg +0 -0
- {perception-0.8.2 → perception-0.8.4}/perception/testing/images/image5.jpg +0 -0
- {perception-0.8.2 → perception-0.8.4}/perception/testing/images/image6.jpg +0 -0
- {perception-0.8.2 → perception-0.8.4}/perception/testing/images/image7.jpg +0 -0
- {perception-0.8.2 → perception-0.8.4}/perception/testing/images/image8.jpg +0 -0
- {perception-0.8.2 → perception-0.8.4}/perception/testing/images/image9.jpg +0 -0
- {perception-0.8.2 → perception-0.8.4}/perception/testing/logos/README.md +0 -0
- {perception-0.8.2 → perception-0.8.4}/perception/testing/logos/logoipsum.png +0 -0
- {perception-0.8.2 → perception-0.8.4}/perception/testing/videos/README.md +0 -0
- {perception-0.8.2 → perception-0.8.4}/perception/testing/videos/expected_tmk.json.gz +0 -0
- {perception-0.8.2 → perception-0.8.4}/perception/testing/videos/rgb.m4v +0 -0
- {perception-0.8.2 → perception-0.8.4}/perception/testing/videos/v1.m4v +0 -0
- {perception-0.8.2 → perception-0.8.4}/perception/testing/videos/v2.m4v +0 -0
- {perception-0.8.2 → perception-0.8.4}/perception/testing/videos/v2s.mov +0 -0
- {perception-0.8.2 → perception-0.8.4}/perception/tools.py +0 -0
- {perception-0.8.2 → perception-0.8.4}/perception/utils.py +0 -0
|
{perception-0.8.2 → perception-0.8.4}/PKG-INFO
@@ -1,13 +1,12 @@
|
|
|
1
1
|
Metadata-Version: 2.4
|
|
2
2
|
Name: Perception
|
|
3
|
-
Version: 0.8.
|
|
3
|
+
Version: 0.8.4
|
|
4
4
|
Summary: Perception provides flexible, well-documented, and comprehensively tested tooling for perceptual hashing research, development, and production use.
|
|
5
|
-
License: Apache-2.0
|
|
5
|
+
License-Expression: Apache-2.0
|
|
6
6
|
License-File: LICENSE
|
|
7
7
|
Author: Thorn
|
|
8
8
|
Author-email: info@wearethorn.org
|
|
9
9
|
Requires-Python: >=3.10,<4.0
|
|
10
|
-
Classifier: License :: OSI Approved :: Apache Software License
|
|
11
10
|
Classifier: Programming Language :: Python :: 3
|
|
12
11
|
Classifier: Programming Language :: Python :: 3.10
|
|
13
12
|
Classifier: Programming Language :: Python :: 3.11
|
|
@@ -15,28 +14,29 @@ Classifier: Programming Language :: Python :: 3.12
|
|
|
15
14
|
Classifier: Programming Language :: Python :: 3.13
|
|
16
15
|
Classifier: Programming Language :: Python :: 3.14
|
|
17
16
|
Provides-Extra: benchmarking
|
|
18
|
-
Provides-Extra: experimental
|
|
19
17
|
Provides-Extra: matching
|
|
20
|
-
|
|
18
|
+
Provides-Extra: pdq
|
|
19
|
+
Requires-Dist: Cython (>=3.0.0,<4.0.0)
|
|
21
20
|
Requires-Dist: Pillow
|
|
22
21
|
Requires-Dist: aiohttp ; extra == "matching"
|
|
23
|
-
Requires-Dist:
|
|
22
|
+
Requires-Dist: albumentations (>=2.0.8,<3.0.0) ; extra == "benchmarking"
|
|
23
|
+
Requires-Dist: faiss-cpu (>=1.8.0,<2.0.0)
|
|
24
24
|
Requires-Dist: ffmpeg-python ; extra == "benchmarking"
|
|
25
|
-
Requires-Dist: imgaug ; extra == "benchmarking"
|
|
26
25
|
Requires-Dist: matplotlib ; extra == "benchmarking"
|
|
27
|
-
Requires-Dist: networkit (>=11,<12) ;
|
|
28
|
-
Requires-Dist:
|
|
29
|
-
Requires-Dist:
|
|
26
|
+
Requires-Dist: networkit (>=11.1,<12.0.0) ; sys_platform != "darwin"
|
|
27
|
+
Requires-Dist: networkx (>=3.0,<4.0) ; sys_platform == "darwin"
|
|
28
|
+
Requires-Dist: numpy (>=1.26.4,<3.0.0)
|
|
29
|
+
Requires-Dist: opencv-contrib-python-headless (>=4.10.0,<5.0.0)
|
|
30
30
|
Requires-Dist: pandas
|
|
31
|
-
Requires-Dist: pdqhash
|
|
31
|
+
Requires-Dist: pdqhash (>=0.2.7,<0.3.0) ; extra == "pdq"
|
|
32
32
|
Requires-Dist: python-json-logger ; extra == "matching"
|
|
33
33
|
Requires-Dist: pywavelets (>=1.5.0,<2.0.0)
|
|
34
34
|
Requires-Dist: rich (>=13.7.0,<14.0.0)
|
|
35
35
|
Requires-Dist: scikit-learn ; extra == "benchmarking"
|
|
36
|
-
Requires-Dist: scipy
|
|
36
|
+
Requires-Dist: scipy
|
|
37
37
|
Requires-Dist: tabulate ; extra == "benchmarking"
|
|
38
|
-
Requires-Dist: tqdm
|
|
39
|
-
Requires-Dist: validators (>=0.22,<1.0)
|
|
38
|
+
Requires-Dist: tqdm (>=4.67.1,<5.0.0)
|
|
39
|
+
Requires-Dist: validators (>=0.22.0,<1.0.0)
|
|
40
40
|
Description-Content-Type: text/markdown
|
|
41
41
|
|
|
42
42
|
# perception 
|
|
{perception-0.8.2 → perception-0.8.4}/perception/approximate_deduplication/__init__.py
@@ -4,11 +4,12 @@ import os.path as op
|
|
|
4
4
|
import typing
|
|
5
5
|
|
|
6
6
|
import faiss
|
|
7
|
-
import networkit as nk
|
|
8
7
|
import numpy as np
|
|
9
8
|
import tqdm
|
|
10
9
|
import typing_extensions
|
|
11
10
|
|
|
11
|
+
from ._graph_backend import get_graph_backend
|
|
12
|
+
|
|
12
13
|
LOGGER = logging.getLogger(__name__)
|
|
13
14
|
DEFAULT_PCT_PROBE = 0
|
|
14
15
|
|
|
@@ -227,16 +228,13 @@ def pairs_to_clusters(
|
|
|
227
228
|
node_to_id_map = {v: k for k, v in id_to_node_map.items()}
|
|
228
229
|
|
|
229
230
|
LOGGER.debug("Building graph.")
|
|
230
|
-
graph = nk.Graph(len(list_ids))
|
|
231
231
|
node_pairs = {(id_to_node_map[pair[0]], id_to_node_map[pair[1]]) for pair in pairs}
|
|
232
|
-
|
|
233
|
-
|
|
232
|
+
backend = get_graph_backend()
|
|
233
|
+
graph = backend.build_graph(len(list_ids), node_pairs)
|
|
234
234
|
|
|
235
235
|
assignments: list[ClusterAssignment] = []
|
|
236
236
|
cluster_index = 0
|
|
237
|
-
|
|
238
|
-
cc_query.run()
|
|
239
|
-
components = cc_query.getComponents()
|
|
237
|
+
components = backend.connected_components(graph)
|
|
240
238
|
|
|
241
239
|
for component in components:
|
|
242
240
|
LOGGER.debug("Got component with size: %s", len(component))
|
|
@@ -246,19 +244,9 @@ def pairs_to_clusters(
|
|
|
246
244
|
)
|
|
247
245
|
cluster_index += 1
|
|
248
246
|
continue
|
|
249
|
-
|
|
250
|
-
|
|
251
|
-
|
|
252
|
-
algo = nk.community.PLP(cc_sub_graph)
|
|
253
|
-
algo.run()
|
|
254
|
-
communities = algo.getPartition()
|
|
255
|
-
community_map = communities.subsetSizeMap()
|
|
256
|
-
for community, size in community_map.items():
|
|
257
|
-
LOGGER.debug("Got community with size: %s", size)
|
|
258
|
-
community_members = list(
|
|
259
|
-
communities.getMembers(community)
|
|
260
|
-
) # Need to do this to do batching.
|
|
261
|
-
community_members = [component_node_map[i] for i in community_members]
|
|
247
|
+
communities = backend.communities(graph, component)
|
|
248
|
+
for community_members in communities:
|
|
249
|
+
LOGGER.debug("Got community with size: %s", len(community_members))
|
|
262
250
|
if strictness == "community":
|
|
263
251
|
assignments.extend(
|
|
264
252
|
[
|
|
@@ -269,33 +257,20 @@ def pairs_to_clusters(
|
|
|
269
257
|
cluster_index += 1
|
|
270
258
|
continue
|
|
271
259
|
|
|
272
|
-
for
|
|
273
|
-
|
|
274
|
-
|
|
275
|
-
|
|
276
|
-
|
|
277
|
-
|
|
278
|
-
|
|
279
|
-
|
|
280
|
-
|
|
260
|
+
for clique_members in backend.maximal_cliques(
|
|
261
|
+
graph,
|
|
262
|
+
community_members,
|
|
263
|
+
max_clique_batch_size=max_clique_batch_size,
|
|
264
|
+
):
|
|
265
|
+
assignments.extend(
|
|
266
|
+
[
|
|
267
|
+
{
|
|
268
|
+
"id": node_to_id_map[n],
|
|
269
|
+
"cluster": cluster_index,
|
|
270
|
+
}
|
|
271
|
+
for n in clique_members
|
|
272
|
+
]
|
|
281
273
|
)
|
|
282
|
-
|
|
283
|
-
while subgraph.numberOfNodes() > 0:
|
|
284
|
-
LOGGER.debug("Subgraph size: %s", subgraph.numberOfNodes())
|
|
285
|
-
clique = nk.clique.MaximalCliques(subgraph, maximumOnly=True)
|
|
286
|
-
clique.run()
|
|
287
|
-
clique_members = clique.getCliques()[0]
|
|
288
|
-
assignments.extend(
|
|
289
|
-
[
|
|
290
|
-
{
|
|
291
|
-
"id": node_to_id_map[community_node_map[n]],
|
|
292
|
-
"cluster": cluster_index,
|
|
293
|
-
}
|
|
294
|
-
for n in clique_members
|
|
295
|
-
]
|
|
296
|
-
)
|
|
297
|
-
cluster_index += 1
|
|
298
|
-
for n in clique_members:
|
|
299
|
-
subgraph.removeNode(n)
|
|
274
|
+
cluster_index += 1
|
|
300
275
|
|
|
301
276
|
return assignments
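Note on the clique stage: the loop that `pairs_to_clusters` now delegates to `backend.maximal_cliques` amounts to greedy peeling. It finds the largest clique among the remaining nodes, emits it as one cluster, removes its nodes, and repeats. The sketch below illustrates that idea in plain Python under stated assumptions: `find_maximal_cliques` and `peel_cliques` are hypothetical names, the enumeration is a basic Bron-Kerbosch rather than the graph-library routines the package calls, ties are broken by the sorted node tuple (an arbitrary but deterministic choice), and batching by `max_clique_batch_size` is left out.

```python
from collections.abc import Iterable, Iterator


def find_maximal_cliques(adj: dict[int, set[int]]) -> Iterator[list[int]]:
    """Yield every maximal clique (basic Bron-Kerbosch, no pivoting)."""

    def expand(
        clique: set[int], candidates: set[int], excluded: set[int]
    ) -> Iterator[list[int]]:
        if not candidates and not excluded:
            # Nothing can extend this clique, so it is maximal.
            yield sorted(clique)
            return
        for node in list(candidates):
            yield from expand(
                clique | {node}, candidates & adj[node], excluded & adj[node]
            )
            candidates = candidates - {node}
            excluded = excluded | {node}

    yield from expand(set(), set(adj), set())


def peel_cliques(
    nodes: Iterable[int], edges: Iterable[tuple[int, int]]
) -> list[list[int]]:
    """Repeatedly remove the largest remaining clique (hypothetical helper)."""
    adj: dict[int, set[int]] = {node: set() for node in nodes}
    for a, b in edges:
        if a in adj and b in adj:
            adj[a].add(b)
            adj[b].add(a)

    cliques: list[list[int]] = []
    while adj:
        # Largest clique first; the sorted node tuple breaks ties deterministically.
        largest = max(find_maximal_cliques(adj), key=lambda c: (len(c), tuple(c)))
        cliques.append(largest)
        for node in largest:
            for neighbor in adj.pop(node):
                if neighbor in adj:
                    adj[neighbor].discard(node)
    return cliques


# A triangle (0, 1, 2) attached to a short chain 2-3-4.
clusters = peel_cliques(range(5), [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4)])
```

Every node ends up in exactly one clique, which is why each peeled clique can be given its own `cluster_index`.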
|
|
perception-0.8.4/perception/approximate_deduplication/_graph_backend.py
@@ -0,0 +1,138 @@
|
|
|
1
|
+
import sys
|
|
2
|
+
import typing
|
|
3
|
+
from abc import ABC, abstractmethod
|
|
4
|
+
|
|
5
|
+
|
|
6
|
+
class GraphBackend(ABC):
|
|
7
|
+
@abstractmethod
|
|
8
|
+
def build_graph(
|
|
9
|
+
self, node_count: int, edges: typing.Iterable[tuple[int, int]]
|
|
10
|
+
) -> typing.Any: ...
|
|
11
|
+
|
|
12
|
+
@abstractmethod
|
|
13
|
+
def connected_components(self, graph: typing.Any) -> list[list[int]]: ...
|
|
14
|
+
|
|
15
|
+
@abstractmethod
|
|
16
|
+
def communities(
|
|
17
|
+
self, graph: typing.Any, component: list[int]
|
|
18
|
+
) -> list[list[int]]: ...
|
|
19
|
+
|
|
20
|
+
@abstractmethod
|
|
21
|
+
def maximal_cliques(
|
|
22
|
+
self,
|
|
23
|
+
graph: typing.Any,
|
|
24
|
+
community_nodes: list[int],
|
|
25
|
+
max_clique_batch_size: int,
|
|
26
|
+
) -> list[list[int]]: ...
|
|
27
|
+
|
|
28
|
+
|
|
29
|
+
class NetworkitGraphBackend(GraphBackend):
|
|
30
|
+
def __init__(self):
|
|
31
|
+
import networkit as nk
|
|
32
|
+
|
|
33
|
+
self.nk = nk
|
|
34
|
+
|
|
35
|
+
def build_graph(
|
|
36
|
+
self, node_count: int, edges: typing.Iterable[tuple[int, int]]
|
|
37
|
+
) -> typing.Any:
|
|
38
|
+
graph = self.nk.Graph(node_count)
|
|
39
|
+
for start, end in edges:
|
|
40
|
+
graph.addEdge(start, end)
|
|
41
|
+
return graph
|
|
42
|
+
|
|
43
|
+
def connected_components(self, graph: typing.Any) -> list[list[int]]:
|
|
44
|
+
cc_query = self.nk.components.ConnectedComponents(graph)
|
|
45
|
+
cc_query.run()
|
|
46
|
+
return cc_query.getComponents()
|
|
47
|
+
|
|
48
|
+
def communities(self, graph: typing.Any, component: list[int]) -> list[list[int]]:
|
|
49
|
+
component_node_map = dict(enumerate(component))
|
|
50
|
+
subgraph = self.nk.graphtools.subgraphFromNodes(graph, component, compact=True)
|
|
51
|
+
algo = self.nk.community.PLP(subgraph, maxIterations=32)
|
|
52
|
+
algo.run()
|
|
53
|
+
communities = algo.getPartition()
|
|
54
|
+
return [
|
|
55
|
+
[component_node_map[node] for node in communities.getMembers(community)]
|
|
56
|
+
for community in communities.subsetSizeMap().keys()
|
|
57
|
+
]
|
|
58
|
+
|
|
59
|
+
def maximal_cliques(
|
|
60
|
+
self,
|
|
61
|
+
graph: typing.Any,
|
|
62
|
+
community_nodes: list[int],
|
|
63
|
+
max_clique_batch_size: int,
|
|
64
|
+
) -> list[list[int]]:
|
|
65
|
+
cliques: list[list[int]] = []
|
|
66
|
+
for start in range(0, len(community_nodes), max_clique_batch_size):
|
|
67
|
+
batch_nodes = community_nodes[start : start + max_clique_batch_size]
|
|
68
|
+
community_node_map = dict(enumerate(batch_nodes))
|
|
69
|
+
subgraph = self.nk.graphtools.subgraphFromNodes(
|
|
70
|
+
graph, batch_nodes, compact=True
|
|
71
|
+
)
|
|
72
|
+
|
|
73
|
+
while subgraph.numberOfNodes() > 0:
|
|
74
|
+
clique = self.nk.clique.MaximalCliques(subgraph, maximumOnly=True)
|
|
75
|
+
clique.run()
|
|
76
|
+
clique_members = clique.getCliques()[0]
|
|
77
|
+
cliques.append([community_node_map[node] for node in clique_members])
|
|
78
|
+
for node in clique_members:
|
|
79
|
+
subgraph.removeNode(node)
|
|
80
|
+
|
|
81
|
+
return cliques
|
|
82
|
+
|
|
83
|
+
|
|
84
|
+
class NetworkxGraphBackend(GraphBackend):
|
|
85
|
+
def __init__(self):
|
|
86
|
+
import networkx as nx
|
|
87
|
+
|
|
88
|
+
self.nx = nx
|
|
89
|
+
|
|
90
|
+
def build_graph(
|
|
91
|
+
self, node_count: int, edges: typing.Iterable[tuple[int, int]]
|
|
92
|
+
) -> typing.Any:
|
|
93
|
+
graph = self.nx.Graph()
|
|
94
|
+
graph.add_nodes_from(range(node_count))
|
|
95
|
+
graph.add_edges_from(edges)
|
|
96
|
+
return graph
|
|
97
|
+
|
|
98
|
+
def connected_components(self, graph: typing.Any) -> list[list[int]]:
|
|
99
|
+
return [list(component) for component in self.nx.connected_components(graph)]
|
|
100
|
+
|
|
101
|
+
def communities(self, graph: typing.Any, component: list[int]) -> list[list[int]]:
|
|
102
|
+
subgraph = graph.subgraph(component)
|
|
103
|
+
return [
|
|
104
|
+
list(community)
|
|
105
|
+
for community in self.nx.algorithms.community.asyn_lpa_communities(
|
|
106
|
+
subgraph, seed=0
|
|
107
|
+
)
|
|
108
|
+
]
|
|
109
|
+
|
|
110
|
+
def maximal_cliques(
|
|
111
|
+
self,
|
|
112
|
+
graph: typing.Any,
|
|
113
|
+
community_nodes: list[int],
|
|
114
|
+
max_clique_batch_size: int,
|
|
115
|
+
) -> list[list[int]]:
|
|
116
|
+
cliques: list[list[int]] = []
|
|
117
|
+
for start in range(0, len(community_nodes), max_clique_batch_size):
|
|
118
|
+
batch_nodes = community_nodes[start : start + max_clique_batch_size]
|
|
119
|
+
subgraph = graph.subgraph(batch_nodes).copy()
|
|
120
|
+
|
|
121
|
+
while subgraph.number_of_nodes() > 0:
|
|
122
|
+
clique_members = max(
|
|
123
|
+
self.nx.find_cliques(subgraph),
|
|
124
|
+
key=lambda clique: (
|
|
125
|
+
len(clique),
|
|
126
|
+
tuple(sorted(clique)),
|
|
127
|
+
),
|
|
128
|
+
)
|
|
129
|
+
cliques.append(list(clique_members))
|
|
130
|
+
subgraph.remove_nodes_from(clique_members)
|
|
131
|
+
|
|
132
|
+
return cliques
|
|
133
|
+
|
|
134
|
+
|
|
135
|
+
def get_graph_backend() -> GraphBackend:
|
|
136
|
+
if sys.platform == "darwin":
|
|
137
|
+
return NetworkxGraphBackend()
|
|
138
|
+
return NetworkitGraphBackend()
|
|
{perception-0.8.2 → perception-0.8.4}/perception/approximate_deduplication/debug.py
@@ -79,10 +79,8 @@ def vizualize_pair(
|
|
|
79
79
|
circle_size=circle_size,
|
|
80
80
|
)
|
|
81
81
|
else:
|
|
82
|
-
LOGGER.warning(
|
|
83
|
-
|
|
84
|
-
won't match perception match points."""
|
|
85
|
-
)
|
|
82
|
+
LOGGER.warning("""No match_metadata provided, recalculating match points,
|
|
83
|
+
won't match perception match points.""")
|
|
86
84
|
img_matched = viz_brute_force(features_1, features_2, img1, img2, ratio=ratio)
|
|
87
85
|
|
|
88
86
|
return img_matched
|
|
{perception-0.8.2 → perception-0.8.4}/perception/benchmarking/common.py
@@ -366,7 +366,7 @@ class BenchmarkHashes(Filterable):
|
|
|
366
366
|
)
|
|
367
367
|
X_noop = np.array(
|
|
368
368
|
noops.hash.apply(
|
|
369
|
-
string_to_vector,
|
|
369
|
+
string_to_vector, # type: ignore[arg-type]
|
|
370
370
|
dtype=dtype,
|
|
371
371
|
hash_format="base64",
|
|
372
372
|
hash_length=int(hash_length),
|
|
@@ -502,8 +502,11 @@ class BenchmarkHashes(Filterable):
|
|
|
502
502
|
ax = axs[rowIdx if nrows > 1 else colIdx]
|
|
503
503
|
|
|
504
504
|
# Plot the charts
|
|
505
|
+
inner_keys = ["guid"] + (
|
|
506
|
+
["transform_name"] if "transform_name" in subset.columns else []
|
|
507
|
+
)
|
|
505
508
|
pos, neg = (
|
|
506
|
-
subset.groupby(
|
|
509
|
+
subset.groupby(inner_keys)[
|
|
507
510
|
[
|
|
508
511
|
"distance_to_closest_correct_image",
|
|
509
512
|
"distance_to_closest_incorrect_image",
|
|
@@ -562,8 +565,11 @@ class BenchmarkHashes(Filterable):
|
|
|
562
565
|
grouping = ["category", "transform_name"]
|
|
563
566
|
|
|
564
567
|
def group_func(subset):
|
|
568
|
+
inner_keys = ["guid"] + (
|
|
569
|
+
["transform_name"] if "transform_name" in subset.columns else []
|
|
570
|
+
)
|
|
565
571
|
pos, neg = (
|
|
566
|
-
subset.groupby(
|
|
572
|
+
subset.groupby(inner_keys)[
|
|
567
573
|
[
|
|
568
574
|
"distance_to_closest_correct_image",
|
|
569
575
|
"distance_to_closest_incorrect_image",
|
|
{perception-0.8.2 → perception-0.8.4}/perception/benchmarking/image.py
@@ -4,7 +4,7 @@ import uuid
|
|
|
4
4
|
import warnings
|
|
5
5
|
|
|
6
6
|
import cv2
|
|
7
|
-
import
|
|
7
|
+
import albumentations
|
|
8
8
|
import pandas as pd
|
|
9
9
|
from tqdm import tqdm
|
|
10
10
|
|
|
@@ -119,7 +119,7 @@ class BenchmarkImageDataset(BenchmarkDataset):
|
|
|
119
119
|
|
|
120
120
|
def transform(
|
|
121
121
|
self,
|
|
122
|
-
transforms: dict[str,
|
|
122
|
+
transforms: dict[str, albumentations.BasicTransform],
|
|
123
123
|
storage_dir: str,
|
|
124
124
|
errors: str = "raise",
|
|
125
125
|
) -> BenchmarkImageTransforms:
|
|
@@ -129,7 +129,7 @@ class BenchmarkImageDataset(BenchmarkDataset):
|
|
|
129
129
|
transforms: A dictionary of transformations. The only required
|
|
130
130
|
key is `noop` which determines how the original, untransformed
|
|
131
131
|
image is saved. For a true copy, simply make the `noop` key
|
|
132
|
-
`
|
|
132
|
+
`albumentations.NoOp`
|
|
133
133
|
storage_dir: A directory to store all the images along with
|
|
134
134
|
their transformed counterparts.
|
|
135
135
|
errors: How to handle errors reading files. If "raise", exceptions are
|
|
@@ -145,7 +145,7 @@ class BenchmarkImageDataset(BenchmarkDataset):
|
|
|
145
145
|
os.makedirs(storage_dir, exist_ok=True)
|
|
146
146
|
|
|
147
147
|
files = self._df.copy()
|
|
148
|
-
files["guid"] = [uuid.uuid4() for n in range(len(files))]
|
|
148
|
+
files["guid"] = [str(uuid.uuid4()) for n in range(len(files))]
|
|
149
149
|
|
|
150
150
|
def apply_transform(files, transform_name):
|
|
151
151
|
transform = transforms[transform_name]
|
|
@@ -166,6 +166,9 @@ class BenchmarkImageDataset(BenchmarkDataset):
|
|
|
166
166
|
continue
|
|
167
167
|
try:
|
|
168
168
|
transformed = transform(image=image)
|
|
169
|
+
# If albumentations, output is a dict with 'image' key
|
|
170
|
+
if isinstance(transformed, dict) and "image" in transformed:
|
|
171
|
+
transformed = transformed["image"]
|
|
169
172
|
except Exception as e:
|
|
170
173
|
raise RuntimeError(
|
|
171
174
|
f"An exception occurred while processing {filepath} "
|
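Note on the transform output handling: albumentations transforms called as `transform(image=image)` return a dict with an `"image"` key, while other callables may return the image directly. The added check accepts both. The sketch below shows the same unwrapping with two hypothetical stand-in transforms on nested lists; neither is an albumentations class.

```python
def extract_image(transformed):
    """Return the image whether the transform gave a dict or a bare image."""
    if isinstance(transformed, dict) and "image" in transformed:
        return transformed["image"]
    return transformed


def dict_style_flip(*, image):
    # Hypothetical transform that mimics the dict-returning convention.
    return {"image": [row[::-1] for row in image]}


def plain_flip(*, image):
    # Hypothetical transform that returns the image directly.
    return [row[::-1] for row in image]


image = [[1, 2, 3], [4, 5, 6]]
from_dict = extract_image(dict_style_flip(image=image))
from_plain = extract_image(plain_flip(image=image))
```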
|
{perception-0.8.2 → perception-0.8.4}/perception/benchmarking/image_transforms.py
@@ -17,7 +17,7 @@ def apply_watermark(watermark, alpha: float = 1.0, size: float = 1.0):
|
|
|
17
17
|
|
|
18
18
|
# Why do we have to do this? It's not clear. But the process doesn't work
|
|
19
19
|
# without it.
|
|
20
|
-
|
|
20
|
+
B, G, R, A = cv2.split(watermark)
|
|
21
21
|
B = cv2.bitwise_and(B, B, mask=A)
|
|
22
22
|
G = cv2.bitwise_and(G, G, mask=A)
|
|
23
23
|
R = cv2.bitwise_and(R, R, mask=A)
|
|
@@ -25,7 +25,7 @@ def apply_watermark(watermark, alpha: float = 1.0, size: float = 1.0):
|
|
|
25
25
|
|
|
26
26
|
def transform(image):
|
|
27
27
|
# Add alpha channel
|
|
28
|
-
|
|
28
|
+
h, w = image.shape[:2]
|
|
29
29
|
wh, ww = watermark.shape[:2]
|
|
30
30
|
scale = size * min(h / wh, w / ww)
|
|
31
31
|
image = np.dstack([image, np.ones((h, w), dtype="uint8") * 255])
|
|
{perception-0.8.2 → perception-0.8.4}/perception/benchmarking/video.py
@@ -94,7 +94,7 @@ class BenchmarkVideoDataset(BenchmarkDataset):
|
|
|
94
94
|
os.makedirs(storage_dir, exist_ok=True)
|
|
95
95
|
|
|
96
96
|
files = self._df.copy()
|
|
97
|
-
files["guid"] = [uuid.uuid4() for n in range(len(files))]
|
|
97
|
+
files["guid"] = [str(uuid.uuid4()) for n in range(len(files))]
|
|
98
98
|
|
|
99
99
|
def apply_transform_to_file(input_filepath, guid, transform_name, category):
|
|
100
100
|
if input_filepath is None:
|
|
{perception-0.8.2 → perception-0.8.4}/perception/hashers/__init__.py
@@ -7,7 +7,6 @@ from .image.wavelet import WaveletHash
|
|
|
7
7
|
from .video.framewise import FramewiseHasher
|
|
8
8
|
from .video.tmk import TMKL1, TMKL2
|
|
9
9
|
|
|
10
|
-
|
|
11
10
|
__all__ = [
|
|
12
11
|
"ImageHasher",
|
|
13
12
|
"VideoHasher",
|
|
@@ -24,3 +23,10 @@ __all__ = [
|
|
|
24
23
|
"PHashU8",
|
|
25
24
|
"PHashF",
|
|
26
25
|
]
|
|
26
|
+
|
|
27
|
+
try:
|
|
28
|
+
from .image.pdq import PDQHash as PDQHash, PDQHashF as PDQHashF
|
|
29
|
+
except ImportError:
|
|
30
|
+
pass
|
|
31
|
+
else:
|
|
32
|
+
__all__.extend(["PDQHash", "PDQHashF"])
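Note on the optional export: the `try` / `except ImportError` / `else` block exports the PDQ hashers only when their import succeeds, which fits `pdqhash` moving behind the `pdq` extra in PKG-INFO. A generalized sketch of the same pattern is below. `export_if_available` is a hypothetical helper, not part of the package, and a stdlib module stands in for the optional dependency.

```python
import importlib


def export_if_available(
    module_name: str, names: list[str], namespace: dict, exported: list[str]
) -> bool:
    """Copy `names` from `module_name` into `namespace` if the import works."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # Optional dependency missing: leave the public API untouched.
        return False
    for name in names:
        namespace[name] = getattr(module, name)
    exported.extend(names)
    return True


public_names = ["CoreHasher"]  # plays the role of __all__
namespace: dict = {}

missing = export_if_available(
    "module_that_does_not_exist_xyz", ["Missing"], namespace, public_names
)
present = export_if_available("fractions", ["Fraction"], namespace, public_names)
```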
|
|
{perception-0.8.2 → perception-0.8.4}/perception/hashers/image/opencv.py
@@ -24,7 +24,7 @@ class MarrHildreth(OpenCVHasher):
|
|
|
24
24
|
|
|
25
25
|
def __init__(self):
|
|
26
26
|
super().__init__()
|
|
27
|
-
self.hasher = cv2.img_hash.MarrHildrethHash.create()
|
|
27
|
+
self.hasher = cv2.img_hash.MarrHildrethHash.create() # type: ignore[attr-defined]
|
|
28
28
|
|
|
29
29
|
def _compute(self, image):
|
|
30
30
|
return np.unpackbits(self.hasher.compute(image)[0])
|
|
@@ -40,7 +40,7 @@ class ColorMoment(OpenCVHasher):
|
|
|
40
40
|
|
|
41
41
|
def __init__(self):
|
|
42
42
|
super().__init__()
|
|
43
|
-
self.hasher = cv2.img_hash.ColorMomentHash.create()
|
|
43
|
+
self.hasher = cv2.img_hash.ColorMomentHash.create() # type: ignore[attr-defined]
|
|
44
44
|
|
|
45
45
|
def _compute(self, image):
|
|
46
46
|
return 10000 * self.hasher.compute(image)[0]
|
|
@@ -56,7 +56,7 @@ class BlockMean(OpenCVHasher):
|
|
|
56
56
|
|
|
57
57
|
def __init__(self):
|
|
58
58
|
super().__init__()
|
|
59
|
-
self.hasher = cv2.img_hash.BlockMeanHash.create(1)
|
|
59
|
+
self.hasher = cv2.img_hash.BlockMeanHash.create(1) # type: ignore[attr-defined]
|
|
60
60
|
|
|
61
61
|
def _compute(self, image):
|
|
62
62
|
# https://stackoverflow.com/questions/54762896/why-cv2-norm-hamming-gives-different-value-than-actual-hamming-distance
|
|
{perception-0.8.2 → perception-0.8.4}/perception/hashers/tools.py
@@ -11,6 +11,7 @@ import os
|
|
|
11
11
|
import queue
|
|
12
12
|
import shlex
|
|
13
13
|
import subprocess
|
|
14
|
+
import tempfile
|
|
14
15
|
import threading
|
|
15
16
|
import typing
|
|
16
17
|
import warnings
|
|
@@ -27,7 +28,9 @@ import validators
|
|
|
27
28
|
|
|
28
29
|
LOGGER = logging.getLogger(__name__)
|
|
29
30
|
|
|
30
|
-
ImageInputType = typing.Union[
|
|
31
|
+
ImageInputType = typing.Union[
|
|
32
|
+
str, np.ndarray, "PIL.Image.Image", io.BytesIO, tempfile.SpooledTemporaryFile
|
|
33
|
+
]
|
|
31
34
|
|
|
32
35
|
SIZES = {"float32": 32, "uint8": 8, "bool": 1}
|
|
33
36
|
|
|
@@ -357,7 +360,10 @@ def read(filepath_or_buffer: ImageInputType, timeout=None) -> np.ndarray:
|
|
|
357
360
|
"""
|
|
358
361
|
if isinstance(filepath_or_buffer, PIL.Image.Image):
|
|
359
362
|
return np.array(filepath_or_buffer.convert("RGB"))
|
|
360
|
-
if isinstance(
|
|
363
|
+
if isinstance(
|
|
364
|
+
filepath_or_buffer,
|
|
365
|
+
(io.BytesIO, client.HTTPResponse, tempfile.SpooledTemporaryFile),
|
|
366
|
+
):
|
|
361
367
|
image = np.asarray(bytearray(filepath_or_buffer.read()), dtype=np.uint8)
|
|
362
368
|
decoded_image = cv2.imdecode(image, cv2.IMREAD_UNCHANGED)
|
|
363
369
|
elif isinstance(filepath_or_buffer, str):
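Note on the widened `isinstance` check: `tempfile.SpooledTemporaryFile` is not a subclass of `io.BytesIO`, so it has to be listed explicitly before `read` will treat it as a byte buffer. The stdlib-only sketch below shows that check in isolation. `read_buffer_bytes` is a hypothetical helper, the payload is fake, and the `client.HTTPResponse` branch and the OpenCV decode step are omitted.

```python
import io
import tempfile

BUFFER_TYPES = (io.BytesIO, tempfile.SpooledTemporaryFile)


def read_buffer_bytes(source) -> bytearray:
    """Return the remaining bytes of a supported in-memory or spooled buffer."""
    if isinstance(source, BUFFER_TYPES):
        return bytearray(source.read())
    raise TypeError(f"Unsupported input type: {type(source).__name__}")


spooled = tempfile.SpooledTemporaryFile(max_size=1024)
spooled.write(b"\x89PNG fake payload")
spooled.seek(0)  # read() starts at the current position, so rewind first
payload = read_buffer_bytes(spooled)
spooled.close()
```

Callers that have just written to the buffer need the `seek(0)`; otherwise `read()` returns an empty result.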
|
|
@@ -561,9 +567,7 @@ def read_video_to_generator_ffmpeg(
|
|
|
561
567
|
), f"Invalid framerate: {frames_per_second}"
|
|
562
568
|
seconds_per_frame = 1 / frames_per_second
|
|
563
569
|
filters.append(
|
|
564
|
-
f"fps={frames_per_second}:"
|
|
565
|
-
f"round={frame_rounding}:"
|
|
566
|
-
f"start_time={offset}"
|
|
570
|
+
f"fps={frames_per_second}:round={frame_rounding}:start_time={offset}"
|
|
567
571
|
)
|
|
568
572
|
# Add resizing filters.
|
|
569
573
|
if use_cuda and codec_name in CUDA_CODECS:
|
|
@@ -601,7 +605,12 @@ def read_video_to_generator_ffmpeg(
|
|
|
601
605
|
if not batch:
|
|
602
606
|
break
|
|
603
607
|
for image in np.frombuffer(batch, dtype="uint8").reshape(
|
|
604
|
-
(
|
|
608
|
+
(
|
|
609
|
+
-1,
|
|
610
|
+
height,
|
|
611
|
+
width,
|
|
612
|
+
channels,
|
|
613
|
+
)
|
|
605
614
|
):
|
|
606
615
|
if frames_per_second != "keyframes":
|
|
607
616
|
yield (image, frame_index, timestamp)
|
|
@@ -960,137 +969,206 @@ def compute_synchronized_video_hashes(
|
|
|
960
969
|
|
|
961
970
|
|
|
962
971
|
def unletterbox(
|
|
963
|
-
image
|
|
972
|
+
image: np.ndarray,
|
|
973
|
+
only_remove_black: bool = False,
|
|
974
|
+
min_fraction_meaningful_pixels: float = 0.1,
|
|
975
|
+
color_threshold: float = 2,
|
|
976
|
+
min_side_length: int = 50,
|
|
977
|
+
min_reduction: float = 0.02,
|
|
964
978
|
) -> tuple[tuple[int, int], tuple[int, int]] | None:
|
|
965
|
-
"""Return bounds of non-trivial region of image or None.
|
|
966
|
-
|
|
967
|
-
|
|
968
|
-
|
|
969
|
-
|
|
970
|
-
|
|
971
|
-
|
|
972
|
-
|
|
973
|
-
|
|
974
|
-
|
|
975
|
-
|
|
976
|
-
|
|
977
|
-
|
|
978
|
-
|
|
979
|
-
|
|
980
|
-
|
|
981
|
-
|
|
982
|
-
|
|
983
|
-
|
|
984
|
-
|
|
985
|
-
|
|
986
|
-
|
|
987
|
-
|
|
988
|
-
|
|
989
|
-
|
|
990
|
-
|
|
991
|
-
|
|
992
|
-
|
|
993
|
-
|
|
994
|
-
|
|
995
|
-
|
|
996
|
-
|
|
997
|
-
|
|
998
|
-
|
|
999
|
-
rows all 1s. `unletterbox(U1)` returns the bounds of the bottom
|
|
1000
|
-
two rows.
|
|
979
|
+
"""Return bounds of the non-trivial (content) region of an image, or None.
|
|
980
|
+
|
|
981
|
+
Letterboxing refers to uniform-color borders added around an image
|
|
982
|
+
(e.g., black bars on a video frame). This function detects such borders
|
|
983
|
+
by identifying the background color from the image corners and finding
|
|
984
|
+
the bounding box of pixels that differ from that background.
|
|
985
|
+
|
|
986
|
+
The function returns bounds as ``(x1, x2), (y1, y2)`` suitable for
|
|
987
|
+
slicing: ``image[y1:y2, x1:x2]``. The bounds are exclusive on the
|
|
988
|
+
right/bottom (i.e., x2 and y2 point one past the last content pixel).
|
|
989
|
+
|
|
990
|
+
**Algorithm overview:**
|
|
991
|
+
|
|
992
|
+
1. Sample the four corner pixels and find the most common value as
|
|
993
|
+
the candidate background color. If all four corners differ, return
|
|
994
|
+
``None`` (no consistent letterbox detected).
|
|
995
|
+
2. Build a binary content mask where each pixel whose grayscale
|
|
996
|
+
intensity differs from the background by more than
|
|
997
|
+
``color_threshold`` is marked as content.
|
|
998
|
+
3. Project the mask onto rows and columns and find the first/last
|
|
999
|
+
row and column where the fraction of content pixels exceeds
|
|
1000
|
+
``min_fraction_meaningful_pixels``.
|
|
1001
|
+
4. Validate that the resulting crop is meaningfully smaller than the
|
|
1002
|
+
original (controlled by ``min_reduction``) and that both sides
|
|
1003
|
+
exceed ``min_side_length``.
|
|
1004
|
+
|
|
1005
|
+
Returns ``None`` when:
|
|
1006
|
+
|
|
1007
|
+
- No two corners share the same color (no clear background).
|
|
1008
|
+
- Every pixel differs from the detected background (no border).
|
|
1009
|
+
- No row or column meets the content-pixel threshold.
|
|
1010
|
+
- The crop would not remove at least ``min_reduction`` fraction
|
|
1011
|
+
from any dimension.
|
|
1012
|
+
- Either cropped dimension would be smaller than ``min_side_length``.
|
|
1001
1013
|
|
|
1002
1014
|
Args:
|
|
1003
|
-
image:
|
|
1004
|
-
|
|
1005
|
-
|
|
1006
|
-
|
|
1007
|
-
|
|
1015
|
+
image: Input image as an ``np.ndarray``. May be grayscale (H×W)
|
|
1016
|
+
or RGB (H×W×3); RGB images are converted to grayscale
|
|
1017
|
+
internally for background detection.
|
|
1018
|
+
only_remove_black: If ``True``, treat black (intensity 0) as the
|
|
1019
|
+
background regardless of corner colors. If ``False`` (default),
|
|
1020
|
+
infer the background color from the most common corner value.
|
|
1021
|
+
min_fraction_meaningful_pixels: The minimum fraction (0–1) of
|
|
1022
|
+
pixels in a row or column that must differ from the background
|
|
1023
|
+
for that row/column to be considered part of the content region.
|
|
1024
|
+
Defaults to 0.1 (10%).
|
|
1025
|
+
color_threshold: The minimum absolute difference in grayscale
|
|
1026
|
+
intensity between a pixel and the background color for that
|
|
1027
|
+
pixel to be classified as content. Defaults to 2.
|
|
1028
|
+
min_side_length: The minimum width or height (in pixels) of the
|
|
1029
|
+
cropped region. If the crop would be smaller, ``None`` is
|
|
1030
|
+
returned. Defaults to 50.
|
|
1031
|
+
min_reduction: The minimum fraction (0–1) of the original width
|
|
1032
|
+
or height that must be removed for the crop to be worthwhile.
|
|
1033
|
+
If the crop removes less than this from both dimensions,
|
|
1034
|
+
``None`` is returned. Defaults to 0.02 (2%).
|
|
1008
1035
|
|
|
1009
1036
|
Returns:
|
|
1010
|
-
A
|
|
1011
|
-
and
|
|
1012
|
-
|
|
1013
|
-
|
|
1037
|
+
A tuple ``((x1, x2), (y1, y2))`` giving the left, right, top,
|
|
1038
|
+
and bottom bounds of the content region (right/bottom exclusive),
|
|
1039
|
+
or ``None`` if no meaningful letterbox was detected.
|
|
1014
1040
|
"""
|
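Note on the docstring's algorithm overview: steps 1 to 3 can be illustrated on a tiny grayscale grid without NumPy or OpenCV. This is a dependency-free sketch under stated assumptions, not the package's implementation. `toy_content_bounds` is a hypothetical name, "exceeds" is read as a strict `>` comparison, and the step 4 checks (`min_reduction`, `min_side_length`) are omitted. The sketch follows the docstring in returning `None` when all four corners differ; the diffed code instead returns the full image bounds in that case for backwards compatibility.

```python
from collections import Counter


def toy_content_bounds(gray, min_fraction=0.1, color_threshold=2):
    """Return ((x1, x2), (y1, y2)) of the content region of a 2D list, or None."""
    h, w = len(gray), len(gray[0])

    # Step 1: the most common corner value is the candidate background.
    corners = [gray[0][0], gray[0][w - 1], gray[h - 1][0], gray[h - 1][w - 1]]
    if len(set(corners)) == 4:
        return None
    background = Counter(corners).most_common(1)[0][0]

    # Step 2: a pixel is content if it differs from the background by more
    # than color_threshold.
    mask = [[abs(p - background) > color_threshold for p in row] for row in gray]

    # Step 3: keep rows/columns whose fraction of content pixels is high enough.
    rows = [y for y in range(h) if sum(mask[y]) / w > min_fraction]
    cols = [x for x in range(w) if sum(mask[y][x] for y in range(h)) / h > min_fraction]
    if not rows or not cols:
        return None

    # Right and bottom bounds are exclusive, matching image[y1:y2, x1:x2].
    return (cols[0], cols[-1] + 1), (rows[0], rows[-1] + 1)


# 6x8 black frame with a bright block in rows 2-3, columns 1-6.
frame = [[0] * 8 for _ in range(6)]
for y in (2, 3):
    for x in range(1, 7):
        frame[y][x] = 200

bounds = toy_content_bounds(frame)
```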
|
1015
|
-
|
|
1016
|
-
|
|
1017
|
-
|
|
1018
|
-
|
|
1019
|
-
|
|
1020
|
-
|
|
1021
|
-
|
|
1022
|
-
|
|
1023
|
-
|
|
1024
|
-
|
|
1025
|
-
|
|
1026
|
-
|
|
1027
|
-
|
|
1028
|
-
|
|
1029
|
-
|
|
1030
|
-
|
|
1031
|
-
|
|
1041
|
+
if not 0 <= min_fraction_meaningful_pixels <= 1:
|
|
1042
|
+
raise ValueError("min_fraction_meaningful_pixels must be between 0 and 1")
|
|
1043
|
+
if not 0 <= min_reduction <= 1:
|
|
1044
|
+
raise ValueError("min_reduction must be between 0 and 1")
|
|
1045
|
+
image = image.astype(np.uint8)
|
|
1046
|
+
|
|
1047
|
+
shape = image.shape
|
|
1048
|
+
h, w = shape[0:2]
|
|
1049
|
+
if len(shape) == 3:
|
|
1050
|
+
image = cv2.cvtColor(image, cv2.COLOR_RGB2GRAY)
|
|
1051
|
+
|
|
1052
|
+
# Determine background color and build binary content mask.
|
|
1053
|
+
if only_remove_black:
|
|
1054
|
+
bg_gray = 0
|
|
1055
|
+
else:
|
|
1056
|
+
# Sample the four corner pixels. If all four are unique there is no
|
|
1057
|
+
# consistent background color, so we bail out early (O(1) rejection).
|
|
1058
|
+
corners = (
|
|
1059
|
+
image[0, 0],
|
|
1060
|
+
image[0, w - 1],
|
|
1061
|
+
image[h - 1, 0],
|
|
1062
|
+
image[h - 1, w - 1],
|
|
1032
1063
|
)
|
|
1033
|
-
if len(counts) == 4:
|
|
1034
|
-
return (0, image.shape[1]), (0, image.shape[0])
|
|
1035
|
-
|
|
1036
|
-
# Grab reference color.
|
|
1037
|
-
# We grab the most common shared color.
|
|
1038
|
-
bg_color, _ = counts.most_common(1)[0]
|
|
1039
|
-
|
|
1040
|
-
# Create an image of just that color. dtype to match image.
|
|
1041
|
-
mask = np.ones((height, width, colors), dtype=np.int16)
|
|
1042
|
-
mask[:, :] = np.array(bg_color)
|
|
1043
|
-
|
|
1044
|
-
# Diff the image so that color is black.
|
|
1045
|
-
image = np.abs(np.subtract(image, mask))
|
|
1046
|
-
|
|
1047
|
-
# adj should be thought of as a boolean at each pixel indicating
|
|
1048
|
-
# whether or not that pixel is non-trivial (True) or not (False).
|
|
1049
|
-
adj = image.mean(axis=2) > 2
|
|
1050
1064
|
|
|
1051
|
-
|
|
1052
|
-
|
|
1065
|
+
if len(set(corners)) == 4:
|
|
1066
|
+
LOGGER.debug("No common corner color detected, skipping content detection.")
|
|
1067
|
+
return (
|
|
1068
|
+
(0, w),
|
|
1069
|
+
(0, h),
|
|
1070
|
+
) # Return full image bounds instead of None to maintain backwards compatibility
|
|
1071
|
+
# Use the most common corner value as the background intensity.
|
|
1072
|
+
counts = Counter(corners)
|
|
1073
|
+
bg_gray = counts.most_common(1)[0][0]
|
|
1074
|
+
|
|
1075
|
+
# Mark pixels whose grayscale intensity differs from the background
|
|
1076
|
+
# by more than color_threshold as content (True).
|
|
1077
|
+
content_mask = np.abs(image.astype(np.int16) - bg_gray) > color_threshold
|
|
1078
|
+
|
|
1079
|
+
# If every pixel is classified as content, there is no border to remove.
|
|
1080
|
+
if content_mask.all():
|
|
1081
|
+
LOGGER.debug("All pixels differ from background; no letterbox detected.")
|
|
1082
|
+
return (
|
|
1083
|
+
(0, w),
|
|
1084
|
+
(0, h),
|
|
1085
|
+
) # Return full image bounds instead of None to maintain backwards compatibility
|
|
1086
|
+
|
|
1087
|
+
# Find the content bounding box by projecting the mask onto rows and
|
|
1088
|
+
# columns. cv2.reduce is used instead of np.sum for performance.
|
|
1089
|
+
mask_u8 = content_mask.astype(np.uint8)
|
|
1090
|
+
row_content = cv2.reduce(mask_u8, 1, cv2.REDUCE_SUM, dtype=cv2.CV_32S).ravel()
|
|
1091
|
+
col_content = cv2.reduce(mask_u8, 0, cv2.REDUCE_SUM, dtype=cv2.CV_32S).ravel()
|
|
1092
|
+
|
|
1093
|
+
# Thresholds for minimum content per row/column
|
|
1094
|
+
row_threshold = min_fraction_meaningful_pixels * w
|
|
1095
|
+
col_threshold = min_fraction_meaningful_pixels * h
|
|
1096
|
+
|
|
1097
|
+
# Find first/last rows and columns with sufficient content
|
|
1098
|
+
content_rows = np.where(row_content > row_threshold)[0]
|
|
1099
|
+
content_cols = np.where(col_content > col_threshold)[0]
|
|
1100
|
+
|
|
1101
|
+
if len(content_rows) == 0 or len(content_cols) == 0:
|
|
1102
|
+
LOGGER.debug("No rows or columns with sufficient content detected.")
|
|
1103
|
+
return None
|
|
1053
1104
|
|
|
1054
|
-
|
|
1055
|
-
|
|
1056
|
-
|
|
1105
|
+
top = int(content_rows[0])
|
|
1106
|
+
bottom = int(content_rows[-1]) + 1
|
|
1107
|
+
left = int(content_cols[0])
|
|
1108
|
+
right = int(content_cols[-1]) + 1
|
|
1109
|
+
height = bottom - top
|
|
1110
|
+
width = right - left
|
|
1057
1111
|
|
|
1058
|
-
#
|
|
1059
|
-
|
|
1112
|
+
# Reject if the crop does not remove at least min_reduction from
|
|
1113
|
+
# at least one dimension (i.e., the border is negligibly thin).
|
|
1114
|
+
if width >= w * (1 - min_reduction) and height >= h * (1 - min_reduction):
|
|
1115
|
+
LOGGER.debug(
|
|
1116
|
+
"Crop would not reduce either dimension by %.0f%%; skipping.",
|
|
1117
|
+
min_reduction * 100,
|
|
1118
|
+
)
|
|
1119
|
+
return (
|
|
1120
|
+
(0, w),
|
|
1121
|
+
(0, h),
|
|
1122
|
+
) # Return full image bounds instead of None to maintain backwards compatibility
|
|
1123
|
+
# Reject if the remaining content region is too small to be useful.
|
|
1124
|
+
if width < min_side_length or height < min_side_length:
|
|
1125
|
+
LOGGER.debug(
|
|
1126
|
+
"Cropped region (%dx%d) smaller than min_side_length=%d; skipping.",
|
|
1127
|
+
width,
|
|
1128
|
+
height,
|
|
1129
|
+
min_side_length,
|
|
1130
|
+
)
|
|
1060
1131
|
return None
|
|
1061
1132
|
|
|
1062
|
-
|
|
1063
|
-
y1 = y2 = y[0]
|
|
1064
|
-
else:
|
|
1065
|
-
y1, y2 = y[[0, -1]]
|
|
1066
|
-
if len(x) == 1:
|
|
1067
|
-
x1 = x2 = x[0]
|
|
1068
|
-
else:
|
|
1069
|
-
x1, x2 = x[[0, -1]]
|
|
1070
|
-
bounds = (x1, x2 + 1), (y1, y2 + 1)
|
|
1071
|
-
|
|
1072
|
-
return bounds
|
|
1133
|
+
return ((left, right), (top, bottom))
|
|
1073
1134
|
|
|
1074
1135
|
|
|
1075
1136
|
def unletterbox_crop(
|
|
1076
|
-
image: np.ndarray,
|
|
1137
|
+
image: np.ndarray,
|
|
1138
|
+
min_fraction_meaningful_pixels: float = 0.1,
|
|
1139
|
+
color_threshold: float = 2,
|
|
1140
|
+
min_side_length: int = 50,
|
|
1141
|
+
min_reduction: float = 0.02,
|
|
1077
1142
|
) -> np.ndarray | None:
|
|
1078
1143
|
"""Detect and crop the letterboxed regions from an image.
|
|
1079
1144
|
|
|
1080
1145
|
Args:
|
|
1081
1146
|
image: The image from which to remove letterboxing.
|
|
1082
1147
|
min_fraction_meaningful_pixels: 0 to 1: if cropped version is
|
|
1083
|
-
|
|
1084
|
-
|
|
1148
|
+
smaller than this fraction of the image do not unletterbox.
|
|
1149
|
+
0.1 == 10% of the image.
|
|
1150
|
+
color_threshold: The minimum absolute difference in grayscale
|
|
1151
|
+
intensity between a pixel and the background color for that
|
|
1152
|
+
pixel to be classified as content. Defaults to 2.
|
|
1153
|
+
min_side_length: The minimum width or height (in pixels) of the
|
|
1154
|
+
cropped region. If the crop would be smaller, ``None`` is
|
|
1155
|
+
returned. Defaults to 50.
|
|
1156
|
+
min_reduction: The minimum fraction (0–1) of the original width
|
|
1157
|
+
or height that must be removed for the crop to be worthwhile.
|
|
1158
|
+
If the crop removes less than this from both dimensions,
|
|
1159
|
+
the original image is returned. Defaults to 0.02 (2%).
|
|
1085
1160
|
Returns:
|
|
1086
1161
|
The cropped image or None if the image is mostly blank space.
|
|
1087
1162
|
"""
|
|
1088
|
-
|
|
1089
|
-
|
|
1090
|
-
), "Please send np.ndarray to unletterbox_image()."
|
|
1163
|
+
if not isinstance(image, np.ndarray):
|
|
1164
|
+
raise TypeError(f"Expected np.ndarray, got {type(image).__name__}")
|
|
1091
1165
|
|
|
1092
1166
|
bounds = unletterbox(
|
|
1093
|
-
image,
|
|
1167
|
+
image,
|
|
1168
|
+
min_fraction_meaningful_pixels=min_fraction_meaningful_pixels,
|
|
1169
|
+
color_threshold=color_threshold,
|
|
1170
|
+
min_side_length=min_side_length,
|
|
1171
|
+
min_reduction=min_reduction,
|
|
1094
1172
|
)
|
|
1095
1173
|
if bounds is None:
|
|
1096
1174
|
return None
|
|
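The new `unletterbox` body above reduces to four steps: pick a background intensity from the corner pixels, mark pixels that differ from it by more than `color_threshold`, count content pixels per row and column, and take the first and last rows and columns whose counts exceed `min_fraction_meaningful_pixels` of the opposite dimension. The following is a minimal sketch of those steps only, assuming a 2-D grayscale `uint8` input and using plain NumPy sums where the diffed code uses `cv2.reduce`. The name `find_content_bounds` is hypothetical and not part of the package, and the `min_reduction` and `min_side_length` checks are left out (in the diffed code the `min_reduction` rejection returns the full image bounds, while the `min_side_length` rejection returns `None`).

```python
from collections import Counter

import numpy as np


def find_content_bounds(gray, color_threshold=2, min_fraction=0.1):
    """Return ((left, right), (top, bottom)) of the content region, or None.

    Hypothetical helper for illustration; not part of the perception package.
    """
    h, w = gray.shape
    # Sample the four corners as plain ints so they can be counted.
    corners = [
        int(gray[0, 0]),
        int(gray[0, w - 1]),
        int(gray[h - 1, 0]),
        int(gray[h - 1, w - 1]),
    ]
    # Four distinct corners: no consistent background, keep the full frame.
    if len(set(corners)) == 4:
        return (0, w), (0, h)
    bg = Counter(corners).most_common(1)[0][0]

    # True where a pixel differs from the background by more than the threshold.
    content = np.abs(gray.astype(np.int16) - bg) > color_threshold

    # Project the mask onto rows and columns.
    row_counts = content.sum(axis=1)
    col_counts = content.sum(axis=0)
    rows = np.where(row_counts > min_fraction * w)[0]
    cols = np.where(col_counts > min_fraction * h)[0]
    if len(rows) == 0 or len(cols) == 0:
        return None

    # Right and bottom bounds are exclusive, matching the docstring above.
    return (int(cols[0]), int(cols[-1]) + 1), (int(rows[0]), int(rows[-1]) + 1)


# A 100x200 black frame with a bright band in rows 20..79,
# i.e. 20-row black bars above and below the content.
frame = np.zeros((100, 200), dtype=np.uint8)
frame[20:80, :] = 200
bounds = find_content_bounds(frame)
```

With this synthetic frame the columns are untouched and only the top and bottom bars are trimmed, so the bounds can be used directly as `frame[top:bottom, left:right]`.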
@@ -56,7 +56,7 @@ class TMKL2(VideoHasher):
|
|
|
56
56
|
for i in range(1, self.m)
|
|
57
57
|
]
|
|
58
58
|
)
|
|
59
|
-
a = a.reshape(1, -1).repeat(repeats=len(self.T), axis=0)
|
|
59
|
+
a = a.reshape(1, -1).repeat(repeats=len(self.T), axis=0) # type: ignore
|
|
60
60
|
a = np.sqrt(a)
|
|
61
61
|
self.a = a[..., np.newaxis]
|
|
62
62
|
|
|
@@ -77,7 +77,12 @@ class TMKL2(VideoHasher):
|
|
|
77
77
|
def hash_from_final_state(self, state):
|
|
78
78
|
timestamps = np.array(state["timestamps"])
|
|
79
79
|
features = np.array(state["features"]).reshape(
|
|
80
|
-
(
|
|
80
|
+
(
|
|
81
|
+
1,
|
|
82
|
+
1,
|
|
83
|
+
timestamps.shape[0],
|
|
84
|
+
self.frame_hasher.hash_length,
|
|
85
|
+
)
|
|
81
86
|
)
|
|
82
87
|
x = self.ms_normed * timestamps
|
|
83
88
|
yw1 = np.sin(x) * self.a
|
|
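The reshape in `hash_from_final_state` gives the per-frame features two leading axes of length 1, so they can broadcast against arrays that carry extra leading dimensions. A small sketch of that shape behaviour, with made-up sizes; the `weights` array below is a stand-in chosen for illustration and its shape is an assumption, not taken from `TMKL2`:

```python
import numpy as np

# Hypothetical sizes: 3 frames, each with a 4-dimensional frame hash.
timestamps = np.array([0.0, 1.0, 2.0])
hash_length = 4
per_frame = [
    np.arange(hash_length, dtype=np.float32) + i for i in range(len(timestamps))
]

# Same reshape pattern as in the diff: (1, 1, n_frames, hash_length).
features = np.array(per_frame).reshape((1, 1, timestamps.shape[0], hash_length))

# Stand-in array with two extra leading axes and one value per frame.
weights = np.ones((2, 5, timestamps.shape[0], 1), dtype=np.float32)

# Broadcasting expands the length-1 axes on both sides.
weighted = weights * features
```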
@@ -101,8 +101,8 @@ def hash_dicts_to_df(hash_dicts, returns_multiple):
|
|
|
101
101
|
),
|
|
102
102
|
"hash": tools.flatten([h["hash"] for h in hash_dicts]),
|
|
103
103
|
}
|
|
104
|
-
).assign(error=
|
|
105
|
-
return pd.DataFrame.from_records(hash_dicts).assign(error=
|
|
104
|
+
).assign(error=np.nan)
|
|
105
|
+
return pd.DataFrame.from_records(hash_dicts).assign(error=np.nan)
|
|
106
106
|
|
|
107
107
|
|
|
108
108
|
def test_hasher_parallelization(hasher, test_filepaths):
|
|
@@ -156,7 +156,9 @@ def test_image_hasher_integrity(
|
|
|
156
156
|
image2 = test_images[1]
|
|
157
157
|
hash1_1 = str(hasher.compute(image1)) # str() games for mypy, not proud
|
|
158
158
|
hash1_2 = str(hasher.compute(Image.open(image1)))
|
|
159
|
-
|
|
159
|
+
image_cv = cv2.imread(image1)
|
|
160
|
+
assert image_cv is not None, f"Failed to load image: {image1}"
|
|
161
|
+
hash1_3 = str(hasher.compute(cv2.cvtColor(image_cv, cv2.COLOR_BGR2RGB)))
|
|
160
162
|
|
|
161
163
|
hash2_1 = str(hasher.compute(image2))
|
|
162
164
|
|
|
@@ -1,55 +1,46 @@
|
|
|
1
|
-
[
|
|
1
|
+
[project]
|
|
2
2
|
name = "Perception"
|
|
3
|
-
|
|
3
|
+
dynamic = []
|
|
4
4
|
description = "Perception provides flexible, well-documented, and comprehensively tested tooling for perceptual hashing research, development, and production use."
|
|
5
|
-
authors = ["Thorn
|
|
6
|
-
license = "Apache
|
|
5
|
+
authors = [{ name = "Thorn", email = "info@wearethorn.org" }]
|
|
6
|
+
license = "Apache-2.0"
|
|
7
7
|
readme = "README.md"
|
|
8
|
+
requires-python = ">=3.10,<4.0"
|
|
9
|
+
dependencies = [
|
|
10
|
+
"Cython>=3.0.0,<4.0.0",
|
|
11
|
+
"numpy>=1.26.4,<3.0.0",
|
|
12
|
+
"opencv-contrib-python-headless>=4.10.0,<5.0.0",
|
|
13
|
+
"faiss-cpu>=1.8.0,<2.0.0",
|
|
14
|
+
"networkit>=11.1,<12.0.0; sys_platform != 'darwin'",
|
|
15
|
+
"networkx>=3.0,<4.0; sys_platform == 'darwin'",
|
|
16
|
+
"pandas",
|
|
17
|
+
"Pillow",
|
|
18
|
+
"pywavelets>=1.5.0,<2.0.0",
|
|
19
|
+
"validators>=0.22.0,<1.0.0",
|
|
20
|
+
"rich>=13.7.0,<14.0.0",
|
|
21
|
+
"scipy",
|
|
22
|
+
"tqdm>=4.67.1,<5.0.0",
|
|
23
|
+
]
|
|
24
|
+
version = "0.8.4"
|
|
8
25
|
|
|
9
|
-
[tool.poetry.dependencies]
|
|
10
|
-
python = "^3.10"
|
|
11
|
-
Cython = "^3"
|
|
12
|
-
numpy = "^1.26"
|
|
13
|
-
opencv-contrib-python-headless = "^4.10"
|
|
14
|
-
pandas = "*"
|
|
15
|
-
pdqhash = "*"
|
|
16
|
-
Pillow = "*"
|
|
17
|
-
pywavelets = "^1.5.0"
|
|
18
|
-
tqdm = "*"
|
|
19
|
-
validators = ">=0.22, <1.0"
|
|
20
|
-
scipy = "*"
|
|
21
|
-
|
|
22
|
-
# Benchmarking Extras
|
|
23
|
-
matplotlib = { version = "*", optional = true }
|
|
24
|
-
imgaug = { version = "*", optional = true }
|
|
25
|
-
tabulate = { version = "*", optional = true }
|
|
26
|
-
scikit-learn = { version = "*", optional = true }
|
|
27
|
-
ffmpeg-python = { version = "*", optional = true }
|
|
28
|
-
|
|
29
|
-
# Matching Extras
|
|
30
|
-
aiohttp = { version = "*", optional = true }
|
|
31
|
-
python-json-logger = { version = "*", optional = true }
|
|
32
|
-
rich = "^13.7.0"
|
|
33
|
-
|
|
34
|
-
# Experimental Extras
|
|
35
|
-
networkit = { version = "^11", optional = true }
|
|
36
|
-
faiss-cpu = { version = "^1.8.0.post1", optional = true }
|
|
37
26
|
|
|
38
|
-
[
|
|
27
|
+
[project.optional-dependencies]
|
|
39
28
|
benchmarking = [
|
|
40
29
|
"matplotlib",
|
|
41
|
-
"
|
|
42
|
-
"imgaug",
|
|
30
|
+
"albumentations>=2.0.8,<3.0.0",
|
|
43
31
|
"tabulate",
|
|
44
32
|
"scikit-learn",
|
|
45
33
|
"ffmpeg-python",
|
|
46
34
|
]
|
|
47
35
|
matching = ["aiohttp", "python-json-logger"]
|
|
48
|
-
|
|
36
|
+
pdq = ["pdqhash>=0.2.7,<0.3.0"]
|
|
37
|
+
|
|
38
|
+
|
|
39
|
+
[tool.poetry]
|
|
49
40
|
|
|
50
41
|
|
|
51
42
|
[tool.poetry.group.dev.dependencies]
|
|
52
|
-
black = "^
|
|
43
|
+
black = "^26"
|
|
53
44
|
coverage = "*"
|
|
54
45
|
ipython = "*"
|
|
55
46
|
mypy = "*"
|
|
@@ -61,6 +52,7 @@ ruff = "*"
|
|
|
61
52
|
types-pillow = "*"
|
|
62
53
|
types-tqdm = "*"
|
|
63
54
|
twine = "*"
|
|
55
|
+
albumentations = "^2.0.8"
|
|
64
56
|
|
|
65
57
|
|
|
66
58
|
[tool.poetry.build]
|
|
@@ -74,6 +66,7 @@ ignore_missing_imports = true
|
|
|
74
66
|
|
|
75
67
|
[tool.poetry-dynamic-versioning]
|
|
76
68
|
enable = false
|
|
69
|
+
vcs = "git"
|
|
77
70
|
|
|
78
71
|
[build-system]
|
|
79
72
|
requires = [
|
|
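The two `sys_platform` markers in the new `dependencies` list mean that `networkit` is installed everywhere except macOS, where `networkx` is installed instead. A minimal sketch of how calling code could reason about that split at runtime; both function names are hypothetical, and this is not the contents of the package's `_graph_backend.py`:

```python
import importlib.util
import sys


def expected_graph_package(platform: str = sys.platform) -> str:
    """Name the graph package the markers would install on `platform`."""
    # sys.platform is "darwin" on macOS, which is what the marker tests.
    return "networkx" if platform == "darwin" else "networkit"


def graph_package_available(platform: str = sys.platform) -> bool:
    """Report whether the expected package can be imported here."""
    return importlib.util.find_spec(expected_graph_package(platform)) is not None
```

Checking with `find_spec` avoids importing the library just to learn whether it is present, which keeps the check cheap in environments where neither package is installed.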
@@ -14,30 +14,32 @@ package_data = \
|
|
|
14
14
|
{'': ['*'], 'perception.testing': ['images/*', 'logos/*', 'videos/*']}
|
|
15
15
|
|
|
16
16
|
install_requires = \
|
|
17
|
-
['Cython>=3,<4',
|
|
17
|
+
['Cython>=3.0.0,<4.0.0',
|
|
18
18
|
'Pillow',
|
|
19
|
-
'
|
|
20
|
-
'
|
|
19
|
+
'faiss-cpu>=1.8.0,<2.0.0',
|
|
20
|
+
'numpy>=1.26.4,<3.0.0',
|
|
21
|
+
'opencv-contrib-python-headless>=4.10.0,<5.0.0',
|
|
21
22
|
'pandas',
|
|
22
|
-
'pdqhash',
|
|
23
23
|
'pywavelets>=1.5.0,<2.0.0',
|
|
24
24
|
'rich>=13.7.0,<14.0.0',
|
|
25
|
-
'
|
|
26
|
-
'
|
|
25
|
+
'scipy',
|
|
26
|
+
'tqdm>=4.67.1,<5.0.0',
|
|
27
|
+
'validators>=0.22.0,<1.0.0']
|
|
27
28
|
|
|
28
29
|
extras_require = \
|
|
29
|
-
{':
|
|
30
|
+
{':sys_platform != "darwin"': ['networkit>=11.1,<12.0.0'],
|
|
31
|
+
':sys_platform == "darwin"': ['networkx>=3.0,<4.0'],
|
|
30
32
|
'benchmarking': ['matplotlib',
|
|
31
|
-
'
|
|
33
|
+
'albumentations>=2.0.8,<3.0.0',
|
|
32
34
|
'tabulate',
|
|
33
35
|
'scikit-learn',
|
|
34
36
|
'ffmpeg-python'],
|
|
35
|
-
'
|
|
36
|
-
'
|
|
37
|
+
'matching': ['aiohttp', 'python-json-logger'],
|
|
38
|
+
'pdq': ['pdqhash>=0.2.7,<0.3.0']}
|
|
37
39
|
|
|
38
40
|
setup_kwargs = {
|
|
39
41
|
'name': 'Perception',
|
|
40
|
-
'version': '0.8.
|
|
42
|
+
'version': '0.8.4',
|
|
41
43
|
'description': 'Perception provides flexible, well-documented, and comprehensively tested tooling for perceptual hashing research, development, and production use.',
|
|
42
44
|
'long_description': "# perception \n\n`perception` provides flexible, well-documented, and comprehensively tested tooling for perceptual hashing research, development, and production use. See [the documentation](https://perception.thorn.engineering/en/latest/) for details.\n\n## Background\n\n`perception` was initially developed at [Thorn](https://www.thorn.org) as part of our work to eliminate child sexual abuse material from the internet. For more information on the issue, check out [our CEO's TED talk](https://www.thorn.org/blog/time-is-now-eliminate-csam/).\n\n## Getting Started\n\n### Installation\n\n`pip install perception`\n\n### Hashing\n\nHashing with different functions is simple with `perception`.\n\n```python\nfrom perception import hashers\n\nfile1, file2 = 'test1.jpg', 'test2.jpg'\nhasher = hashers.PHash()\nhash1, hash2 = hasher.compute(file1), hasher.compute(file2)\ndistance = hasher.compute_distance(hash1, hash2)\n```\n\n### Examples\n\nSee below for end-to-end examples for common use cases for perceptual hashes.\n\n- [Detecting child sexual abuse material](https://perception.thorn.engineering/en/latest/examples/detecting_csam.html)\n- [Deduplicating media](https://perception.thorn.engineering/en/latest/examples/deduplication.html)\n- [Benchmarking perceptual hashes](https://perception.thorn.engineering/en/latest/examples/benchmarking.html)\n\n## Supported Hashing Algorithms\n\n`perception` currently ships with:\n\n- pHash (DCT hash) (`perception.hashers.PHash`)\n- Facebook's PDQ Hash (`perception.hashers.PDQ`)\n- dHash (difference hash) (`perception.hashers.DHash`)\n- aHash (average hash) (`perception.hashers.AverageHash`)\n- Marr-Hildreth (`perception.hashers.MarrHildreth`)\n- Color Moment (`perception.hashers.ColorMoment`)\n- Block Mean (`perception.hashers.BlockMean`)\n- wHash (wavelet hash) (`perception.hashers.WaveletHash`)\n\n## Contributing\n\nTo work on the project, start by doing the following.\n\n```bash\n# Install local dependencies 
for\n# code completion, etc.\nmake init\n\n- To do a (close to) comprehensive check before committing code, you can use `make precommit`.\n\nTo implement new features, please first file an issue proposing your change for discussion.\n\nTo report problems, please file an issue with sample code, expected results, actual results, and a complete traceback.\n\n## Alternatives\n\nThere are other packages worth checking out to see if they meet your needs for perceptual hashing. Here are some\nexamples.\n\n- [dedupe](https://github.com/dedupeio/dedupe)\n- [imagededup](https://idealo.github.io/imagededup/)\n- [ImageHash](https://github.com/JohannesBuchner/imagehash)\n- [PhotoHash](https://github.com/bunchesofdonald/photohash)\n```\n",
|
|
43
45
|
'author': 'Thorn',
|
|
File without changes
|
|