dhash-vips 0.0.5.0 → 0.0.5.1
- checksums.yaml +4 -4
- data/README.md +39 -20
- data/Rakefile +1 -1
- data/dhash-vips.gemspec +1 -1
- data/lib/dhash-vips.rb +6 -7
- metadata +16 -16
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA1:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 73ec3cad8fd74c5eef699cb7c2251077a667c6c0
+  data.tar.gz: 925b66f53a63671a6b6985a37e7224d9e1b399dc
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 8058297adddc0c433c4fce7f1a6e61a345f83801f3e6a739f90231bf289a02665b799e95e997cd3e3c3af327474c2ced3f341770b79423ef9018eb0411998f4a
+  data.tar.gz: 0466d9ee38940e67e768336c9f216710f9e721d836cb92901a218b506f7bfc4e109e718684bd3097350530eab31e34cca234bc1ab0a168f323c9490525086506
data/README.md
CHANGED
@@ -11,15 +11,15 @@ There were several implementations on Github already but they all depend on Imag
 ```
 load and calculate the fingerprint:
 user system total real
-Dhash
-DHashVips::DHash 1.
-DHashVips::IDHash 1.
+Dhash 13.110000 0.950000 14.060000 ( 14.537057)
+DHashVips::DHash 1.480000 0.310000 1.790000 ( 1.808787)
+DHashVips::IDHash 1.080000 0.100000 1.180000 ( 1.156446)
 
 measure the distance (1000 times):
-
-Dhash hamming
-DHashVips::DHash
-DHashVips::IDHash
+user system total real
+Dhash hamming 1.770000 0.010000 1.780000 ( 1.815612)
+DHashVips::DHash 1.810000 0.010000 1.820000 ( 1.875666)
+DHashVips::IDHash 3.430000 0.020000 3.450000 ( 3.499031)
 ```
 
 Here the `Dhash` is [another gem](https://github.com/maccman/dhash) that I used earlier in my projects.
@@ -36,7 +36,7 @@ It has improvements over the dHash that made fingerprinting less sensitive to th
 * It subtracts not only horizontally but also vertically -- that adds 128 more bits.
 * Instead of resizing to 9x8 it resizes to 8x8 and puts the image on a torus so it subtracts the left column from the right one and the top from bottom.
 
-You could see in fingerprint calculation benchmark earlier that these improvements didn't make it slower than dHash because most of the time is spent on image resizing. The calculation of distance is what became two times slower:
+You could see in fingerprint calculation benchmark earlier that these improvements didn't make it slower than dHash because most of the time is spent on image resizing (at some point it actually even became faster, idk why). The calculation of distance is what became two times slower:
 ```ruby
 ((a | b) & ((a ^ b) >> 128)).to_s(2).count "1"
 ```
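Annotation on the hunk above: the README's distance expression can be tried on plain Integers without loading any images. The sketch below is only an illustration. The helper name `readme_distance` is hypothetical, and it assumes each fingerprint is a non-negative 256-bit Integer whose upper 128 bits are compared with XOR while the lower 128 bits of `a | b` decide which of those differences are counted. That reading comes from the expression itself, not from the gem's documentation.

```ruby
# Hypothetical helper wrapping the README expression for 8x8 fingerprints.
# Assumption: a and b are non-negative Integers of at most 256 bits.
def readme_distance a, b
  differing = (a ^ b) >> 128               # upper halves, compared bit by bit
  ((a | b) & differing).to_s(2).count "1"  # only the low 128 bits of a | b survive the AND
end

upper = 1 << 128  # a single bit in the upper half
lower = 1         # the bit at the same position in the lower half

puts readme_distance(upper | lower, lower)          # 1: upper bits differ, lower bit is set
puts readme_distance(upper, 0)                      # 0: upper bits differ, but no lower bit is set
puts readme_distance(upper | lower, upper | lower)  # 0: identical fingerprints
```

The second call shows the masking effect: a differing bit in the upper half is ignored when neither fingerprint has the corresponding lower bit set.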
@@ -102,11 +102,11 @@ end
 ```ruby
 require "dhash-vips"
 
-hash1 = DHashVips::IDHash.
-hash2 = DHashVips::IDHash.
+hash1 = DHashVips::IDHash.fingerprint "photo1.jpg"
+hash2 = DHashVips::IDHash.fingerprint "photo2.jpg"
 
 distance = DHashVips::IDHash.distance hash1, hash2
-if distance <
+if distance < 15
   puts "Images are very similar"
 elsif distance < 25
   puts "Images are slightly similar"
@@ -115,7 +115,7 @@ else
 end
 ```
 
-These `
+These `15` and `25` numbers are found empirically and just work enough well for 8-byte hashes.
 To find out these tresholds we can run a rake task with hardcoded test cases:
 ```
 $ rake compare_matrices
@@ -149,10 +149,10 @@ Different images: 102..211
 ### Notes
 
 * Methods were renamed from `#calculate` to `#fingerprint` and from `#hamming` to `#distance`.
-* The `DHash#calculate` accepts `hash_size` optional parameter that is 8 by default. The `IDHash#fingerprint`'s optional parameter is called `power` and works in a bit different way: 3 means 8 and 4 means 16 -- other sizes are not supported because they don't seem to be useful (higher fingerprint resolution makes it vulnerable to image shifts and croppings). Because IDHash's fingerprint is more complex than DHash's one it's not that straight forward to compare them so under the hood the `#distance` methods have to check the size of fingerprint -- this trade-off costs 30-40% of speed that can be eliminated by using `#distance3` method that assumes fingerprint to be of power=3. So the full benchmark is this one:
+* The `DHash#calculate` accepts `hash_size` optional parameter that is 8 by default. The `IDHash#fingerprint`'s optional parameter is called `power` and works in a bit different way: 3 means 8 and 4 means 16 -- other sizes are not supported because they don't seem to be useful (higher fingerprint resolution makes it vulnerable to image shifts and croppings, also `#distance` becomes much slower). Because IDHash's fingerprint is more complex than DHash's one it's not that straight forward to compare them so under the hood the `#distance` methods have to check the size of fingerprint -- this trade-off costs 30-40% of speed that can be eliminated by using `#distance3` method that assumes fingerprint to be of power=3. So the full benchmark is this one:
 
 ```
-
+# Ruby 2.0.0
 
 load and calculate the fingerprint:
 user system total real
@@ -162,12 +162,31 @@ DHashVips::IDHash 1.060000 0.090000 1.150000 ( 1.100332)
 DHashVips::IDHash 4 1.030000 0.080000 1.110000 ( 1.089148)
 
 measure the distance (1000 times):
-
-Dhash hamming
-DHashVips::DHash hamming
-DHashVips::IDHash distance
-DHashVips::IDHash distance3
-DHashVips::IDHash distance 4
+user system total real
+Dhash hamming 3.140000 0.020000 3.160000 ( 3.179392)
+DHashVips::DHash hamming 3.040000 0.020000 3.060000 ( 3.095190)
+DHashVips::IDHash distance 8.170000 0.040000 8.210000 ( 8.279950)
+DHashVips::IDHash distance3 6.720000 0.030000 6.750000 ( 6.790900)
+DHashVips::IDHash distance 4 24.430000 0.130000 24.560000 ( 24.652625)
+```
+(macOS system MRI 2.3 has some nice bit arithmetics improvement compared to 2.0)
+```
+# Ruby 2.3.3
+
+load and calculate the fingerprint:
+user system total real
+Dhash 13.110000 0.950000 14.060000 ( 14.537057)
+DHashVips::DHash 1.480000 0.310000 1.790000 ( 1.808787)
+DHashVips::IDHash 1.080000 0.100000 1.180000 ( 1.156446)
+DHashVips::IDHash 4 1.030000 0.090000 1.120000 ( 1.076117)
+
+measure the distance (1000 times):
+user system total real
+Dhash hamming 1.770000 0.010000 1.780000 ( 1.815612)
+DHashVips::DHash hamming 1.810000 0.010000 1.820000 ( 1.875666)
+DHashVips::IDHash distance 4.250000 0.020000 4.270000 ( 4.350071)
+DHashVips::IDHash distance3 3.430000 0.020000 3.450000 ( 3.499031)
+DHashVips::IDHash distance 4 8.210000 0.110000 8.320000 ( 8.510735)
 ```
 
 Also note that to make `#distance` able to assume the fingerprint resolution from the size of Integer that represents it, the change in its structure was needed (left half of bits was swapped with right one), so fingerprints between versions 0.0.4 and 0.0.5 became incompatible, but you probably can convert them manually. I know, incompatibilities suck but if we put the version or structure information inside fingerprint it will became slow to (de)serialize and store.
data/Rakefile
CHANGED
@@ -230,7 +230,7 @@ task :compare_speed do
   end
   hashes[-1, 1] = hashes[-2, 2] # for `distance` and `distance3` we use the same hashes
   puts "\nmeasure the distance (1000 times):"
-  Benchmark.bm
+  Benchmark.bm 29 do |bm|
     [
       [Dhash, :hamming],
       [DHashVips::DHash, :hamming],
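Annotation on the hunk above: the added `29` is the label width argument of `Benchmark.bm` from Ruby's standard library, which pads every report label so that the `user system total real` header lines up with the numbers. The sketch below reuses two labels from the README benchmark. The workloads are hypothetical stand-ins, not the Rakefile's real measurements, and the reason 29 was picked is an assumption: the longest label there, `DHashVips::IDHash distance 4`, is 28 characters long.

```ruby
require "benchmark"

# Hypothetical workloads; only the labels come from the README benchmark output.
jobs = {
  "Dhash hamming"                => -> { 1000.times { (2 ** 64).to_s(2).count "1" } },
  "DHashVips::IDHash distance 4" => -> { 1000.times { (2 ** 1024).to_s(2).count "1" } },
}

# The first argument is the label width: labels are left-justified to 29 columns.
results = Benchmark.bm 29 do |bm|
  jobs.each { |label, job| bm.report(label, &job) }
end

# Benchmark.bm returns one Benchmark::Tms per report.
puts results.size
```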
data/dhash-vips.gemspec
CHANGED
data/lib/dhash-vips.rb
CHANGED
@@ -34,11 +34,10 @@ module DHashVips
     end
     def distance a, b
       size_a, size_b = [a, b].map do |x|
-
-
-
-
-        end
+        # TODO write a test about possible hash sizes
+        # they were 32 and 128, 124, 120 for MRI 2.0
+        # but also 31, 30 happens for MRI 2.3
+        x.size <= 32 ? 8 : 16
       end
       fail "fingerprints were taken with different `power` param: #{size_a} and #{size_b}" if size_a != size_b
       ((a ^ b) & (a | b) >> 2 * size_a * size_a).to_s(2).count "1"
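Annotation on the hunk above: the new comments record that `Integer#size` reports different byte counts for the same kind of fingerprint on MRI 2.0 and MRI 2.3. One possible alternative, shown here only as a sketch, is to look at the number of significant bits instead. The helper name `side_from_fingerprint` is hypothetical and not part of dhash-vips. The 256-bit boundary is an assumption derived from the shift `2 * size_a * size_a` in the same method, which suggests a fingerprint of `4 * size * size` bits: 256 for a side of 8 and 1024 for a side of 16. `Integer#bit_length` is available from Ruby 2.1 on, so it would not run on the Ruby 2.0.0 used in the README benchmark.

```ruby
# Hypothetical helper, not part of dhash-vips.
# Assumption: a fingerprint with side 8 fits in 256 bits, one with side 16 needs up to 1024.
def side_from_fingerprint x
  x.bit_length <= 256 ? 8 : 16
end

puts side_from_fingerprint(1 << 255)   # 8: exactly 256 significant bits
puts side_from_fingerprint(1 << 256)   # 16: 257 significant bits
puts side_from_fingerprint(1 << 1023)  # 16
```

Like the `x.size <= 32` check, this still misreads a side 16 fingerprint whose top 768 bits all happen to be zero, so it narrows the version dependence without removing the ambiguity.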
@@ -64,9 +63,9 @@ module DHashVips
     fail unless 1 == @@median[[1, 1, 1]]
     fail unless 1 == @@median[[1, 1]]
 
-    def fingerprint
+    def fingerprint filename, power = 3
       size = 2 ** power
-      image = Vips::Image.new_from_file
+      image = Vips::Image.new_from_file filename
       image = image.resize(size.fdiv(image.width), vscale: size.fdiv(image.height)).colourspace("b-w")
 
       array = image.to_a.map &:flatten
metadata
CHANGED
@@ -1,83 +1,83 @@
 --- !ruby/object:Gem::Specification
 name: dhash-vips
 version: !ruby/object:Gem::Version
-  version: 0.0.5.
+  version: 0.0.5.1
 platform: ruby
 authors:
 - Victor Maslov
 autorequire:
 bindir: bin
 cert_chain: []
-date: 2018-
+date: 2018-03-05 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: ruby-vips
   requirement: !ruby/object:Gem::Requirement
     requirements:
-    - -
+    - - ">="
       - !ruby/object:Gem::Version
         version: '0'
   type: :runtime
   prerelease: false
   version_requirements: !ruby/object:Gem::Requirement
     requirements:
-    - -
+    - - ">="
       - !ruby/object:Gem::Version
         version: '0'
 - !ruby/object:Gem::Dependency
   name: rake
   requirement: !ruby/object:Gem::Requirement
     requirements:
-    - -
+    - - ">="
       - !ruby/object:Gem::Version
         version: '0'
   type: :development
   prerelease: false
   version_requirements: !ruby/object:Gem::Requirement
     requirements:
-    - -
+    - - ">="
       - !ruby/object:Gem::Version
         version: '0'
 - !ruby/object:Gem::Dependency
   name: rspec-core
   requirement: !ruby/object:Gem::Requirement
     requirements:
-    - -
+    - - ">="
       - !ruby/object:Gem::Version
         version: '0'
   type: :development
   prerelease: false
   version_requirements: !ruby/object:Gem::Requirement
     requirements:
-    - -
+    - - ">="
       - !ruby/object:Gem::Version
         version: '0'
 - !ruby/object:Gem::Dependency
   name: dhash
   requirement: !ruby/object:Gem::Requirement
     requirements:
-    - -
+    - - ">="
       - !ruby/object:Gem::Version
         version: '0'
   type: :development
   prerelease: false
   version_requirements: !ruby/object:Gem::Requirement
     requirements:
-    - -
+    - - ">="
       - !ruby/object:Gem::Version
         version: '0'
 - !ruby/object:Gem::Dependency
   name: get_process_mem
   requirement: !ruby/object:Gem::Requirement
     requirements:
-    - -
+    - - ">="
       - !ruby/object:Gem::Version
         version: '0'
   type: :development
   prerelease: false
   version_requirements: !ruby/object:Gem::Requirement
     requirements:
-    - -
+    - - ">="
       - !ruby/object:Gem::Version
         version: '0'
 description:
@@ -86,7 +86,7 @@ executables: []
 extensions: []
 extra_rdoc_files: []
 files:
-- .gitignore
+- ".gitignore"
 - Gemfile
 - LICENSE.txt
 - README.md
@@ -104,17 +104,17 @@ require_paths:
 - lib
 required_ruby_version: !ruby/object:Gem::Requirement
   requirements:
-  - -
+  - - ">="
     - !ruby/object:Gem::Version
       version: '0'
 required_rubygems_version: !ruby/object:Gem::Requirement
   requirements:
-  - -
+  - - ">="
     - !ruby/object:Gem::Version
       version: '0'
 requirements: []
 rubyforge_project:
-rubygems_version: 2.
+rubygems_version: 2.5.2
 signing_key:
 specification_version: 4
 summary: dHash and IDHash powered by Vips