measurable 0.0.4 → 0.0.5
- checksums.yaml +4 -4
- data/Gemfile.lock +1 -1
- data/README.md +35 -16
- data/Rakefile +16 -3
- data/lib/measurable.rb +6 -37
- data/lib/measurable/cosine.rb +23 -6
- data/lib/measurable/euclidean.rb +71 -35
- data/lib/measurable/haversine.rb +60 -35
- data/lib/measurable/jaccard.rb +64 -21
- data/lib/measurable/maxmin.rb +30 -9
- data/lib/measurable/tanimoto.rb +29 -8
- data/lib/measurable/version.rb +1 -1
- data/spec/cosine_spec.rb +29 -0
- data/spec/euclidean_spec.rb +61 -0
- data/spec/haversine_spec.rb +37 -0
- data/spec/jaccard_spec.rb +62 -0
- data/spec/maxmin_spec.rb +30 -0
- data/spec/spec_helper.rb +2 -0
- data/spec/tanimoto_spec.rb +30 -0
- metadata +15 -5
- data/spec/measurable_spec.rb +0 -159
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
|
|
1
1
|
---
|
2
2
|
SHA1:
|
3
|
-
metadata.gz:
|
4
|
-
data.tar.gz:
|
3
|
+
metadata.gz: 66337383a6c25685893bb39f7caf5c7f0b40bcff
|
4
|
+
data.tar.gz: 117a22bb28b1d36f14780d3a22bbad7211b279a0
|
5
5
|
SHA512:
|
6
|
-
metadata.gz:
|
7
|
-
data.tar.gz:
|
6
|
+
metadata.gz: 0d7aff51213d2f0ca31472d1d5f43b3ce99d58255b9d3d53485b0983e953866a365927734c1852c7040f7c23149455712af6357679e03ff63bf247709ac7e124
|
7
|
+
data.tar.gz: 652066925d87d7d52656f87346c856d34296e4a98662e94a670d72e9612c568dd95bf758314c524501700b21b53484eaaf059e5b5fee315129fb1b0b90a170bd
|
data/Gemfile.lock
CHANGED
data/README.md
CHANGED
@@ -1,25 +1,32 @@
|
|
1
1
|
# Measurable
|
2
2
|
|
3
|
-
|
3
|
+
A gem to test what metric is best for certain kinds of datasets in machine learning.
|
4
4
|
|
5
|
-
|
5
|
+
Besides the `Array` class, I also want to support `NVector` (from [NMatrix](http://github.com/sciruby/nmatrix)).
|
6
6
|
|
7
|
-
|
7
|
+
The distance measures will be created in Ruby first. If I see that it's really too slow, I'll write some methods in C (or Java, for JRuby).
|
8
|
+
|
9
|
+
This is a fork of the gem [Distance Measure](https://github.com/reddavis/Distance-Measures), which has a similar objective, but isn't actively maintained and doesn't support NMatrix. Thank you, [@reddavis][reddavis]. :)
|
8
10
|
|
9
11
|
## Install
|
10
12
|
|
11
13
|
`gem install measurable`
|
12
14
|
|
13
|
-
|
15
|
+
I only tested it with 2.0.0 (yes, yes, travis, I'll do it eventually). I want to support JRuby as well.
|
16
|
+
|
17
|
+
## Distance measures
|
18
|
+
|
19
|
+
I'm using the term "distance measure" without much concern for the strict mathematical definition of a metric. If the documentation for one of the methods isn't clear about it being or not a metric, please open an issue.
|
14
20
|
|
15
|
-
|
21
|
+
The following are the similarity measures supported at the moment:
|
16
22
|
|
17
23
|
- Euclidean distance
|
18
24
|
- Squared euclidean distance
|
19
25
|
- Cosine distance
|
20
|
-
- Max-min distance (["K-Means clustering using max-min distance measure"][
|
26
|
+
- Max-min distance (from ["K-Means clustering using max-min distance measure"][maxmin])
|
21
27
|
- Jaccard distance
|
22
28
|
- Tanimoto distance
|
29
|
+
- Haversine distance
|
23
30
|
|
24
31
|
These still need to be implemented:
|
25
32
|
|
@@ -36,30 +43,42 @@ These still need to be implemented:
|
|
36
43
|
|
37
44
|
## How to use
|
38
45
|
|
39
|
-
This list will be updated as I have time. I'll refactor the existing measures and add some that I'll need in a project.
|
40
|
-
|
41
46
|
The API I intend to support is something like this:
|
42
47
|
|
43
48
|
```ruby
|
44
49
|
require "measurable"
|
45
|
-
|
50
|
+
|
46
51
|
u = NVector.ones(2)
|
47
52
|
v = NVector.zeros(2)
|
48
53
|
w = [1, 0]
|
49
54
|
x = [2, 2]
|
50
55
|
|
51
|
-
|
52
|
-
Measurable
|
53
|
-
Measurable
|
54
|
-
Measurable
|
56
|
+
# Calculate the distance between two points in space.
|
57
|
+
Measurable.euclidean(u, v) # => 1.41421
|
58
|
+
Measurable.euclidean(w, v) # => 1.00000
|
59
|
+
Measurable.cosine([1, 2], [2, 3]) # => 0.00772
|
60
|
+
|
61
|
+
# Calculate the norm of a vector, i.e. its distance from the origin.
|
62
|
+
Measurable.euclidean_squared([3, 4]) # => 25
|
55
63
|
```
|
56
64
|
|
57
|
-
|
65
|
+
## Documentation
|
66
|
+
|
67
|
+
`RDoc` syntax is used to document the project. To build it locally, you'll need to install the [Fivefish generator](https://github.com/ged/rdoc-generator-fivefish) (`gem install rdoc-generator-fivefish`) and run the following command:
|
68
|
+
|
69
|
+
```bash
|
70
|
+
rdoc -f fivefish -m README.md *.md LICENSE lib/
|
71
|
+
```
|
72
|
+
|
73
|
+
I want to be able to use a Rake task to generate the documentation, thus allowing me to forget the specific command. However, there's a bug in `RDoc::Task` in which [custom generators (like Fivefish) can't be used](https://github.com/rdoc/rdoc/issues/246).
|
74
|
+
|
75
|
+
If there's something wrong with an explanation or if there's information missing, please open an issue or send a pull request.
|
58
76
|
|
59
77
|
## License
|
60
78
|
|
61
79
|
See LICENSE for details.
|
62
80
|
|
63
|
-
The original `
|
81
|
+
The original `distance_measures` gem is copyrighted by [@reddavis][reddavis].
|
64
82
|
|
65
|
-
[
|
83
|
+
[maxmin]: http://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=05156398
|
84
|
+
[reddavis]: (https://github.com/reddavis)
|
data/Rakefile
CHANGED
@@ -1,6 +1,7 @@
|
|
1
1
|
require 'rake'
|
2
2
|
require 'bundler/gem_tasks'
|
3
3
|
require "rspec/core/rake_task"
|
4
|
+
# require 'rdoc/task' # See below.
|
4
5
|
|
5
6
|
# Setup the necessary gems, specified in the gemspec.
|
6
7
|
require 'bundler'
|
@@ -15,10 +16,22 @@ end
|
|
15
16
|
# Run all the specs.
|
16
17
|
RSpec::Core::RakeTask.new(:spec)
|
17
18
|
|
19
|
+
# RDoc task isn't working with custom generators, as can be seen in:
|
20
|
+
# https://github.com/rdoc/rdoc/issues/246
|
21
|
+
#
|
22
|
+
# Whenever this issue is fixed, I'll resume using this task.
|
23
|
+
#
|
24
|
+
# RDoc::Task.new do |rdoc|
|
25
|
+
# rdoc.main = "README.md"
|
26
|
+
# rdoc.rdoc_files.include("README.md", "LICENSE", "lib")
|
27
|
+
# rdoc.generator = "fivefish"
|
28
|
+
# rdoc.external = true
|
29
|
+
# end
|
30
|
+
|
18
31
|
# Compile task.
|
19
32
|
# Rake::ExtensionTask.new do |ext|
|
20
|
-
# ext.name = 'measurable'
|
21
|
-
# ext.ext_dir = 'ext/measurable'
|
33
|
+
# ext.name = 'measurable'
|
34
|
+
# ext.ext_dir = 'ext/measurable'
|
22
35
|
# ext.lib_dir = 'lib/'
|
23
|
-
# ext.source_pattern = "**/*.{c, cpp, h}"
|
36
|
+
# ext.source_pattern = "**/*.{c, cpp, h}"
|
24
37
|
# end
|
data/lib/measurable.rb
CHANGED
@@ -1,47 +1,16 @@
|
|
1
|
-
require 'measurable/version
|
1
|
+
require 'measurable/version'
|
2
2
|
|
3
|
-
# Distance measures.
|
3
|
+
# Distance measures. The require order is important.
|
4
4
|
require 'measurable/euclidean'
|
5
5
|
require 'measurable/cosine'
|
6
|
-
require 'measurable/tanimoto'
|
7
6
|
require 'measurable/jaccard'
|
7
|
+
require 'measurable/tanimoto'
|
8
8
|
require 'measurable/haversine'
|
9
9
|
require 'measurable/maxmin'
|
10
10
|
|
11
11
|
module Measurable
|
12
|
-
# PI
|
13
|
-
RAD_PER_DEG =
|
14
|
-
class << self
|
15
|
-
def binary_union(u, v)
|
16
|
-
unions = []
|
17
|
-
u.each_with_index do |n, index|
|
18
|
-
if n == 1 || v[index] == 1
|
19
|
-
unions << 1
|
20
|
-
else
|
21
|
-
unions << 0
|
22
|
-
end
|
23
|
-
end
|
24
|
-
|
25
|
-
unions
|
26
|
-
end
|
27
|
-
|
28
|
-
def binary_intersection(u, v)
|
29
|
-
intersects = []
|
30
|
-
u.each_with_index do |n, index|
|
31
|
-
if n == 1 && v[index] == 1
|
32
|
-
intersects << 1
|
33
|
-
else
|
34
|
-
intersects << 0
|
35
|
-
end
|
36
|
-
end
|
37
|
-
|
38
|
-
intersects
|
39
|
-
end
|
12
|
+
# PI / 180 degrees.
|
13
|
+
RAD_PER_DEG = Math::PI / 180
|
40
14
|
|
41
|
-
|
42
|
-
# handle NaN"s is set to false
|
43
|
-
def handle_nan(result)
|
44
|
-
result.nan? ? 0.0 : result
|
45
|
-
end
|
46
|
-
end
|
15
|
+
extend self # expose all instance methods as singleton methods.
|
47
16
|
end
|
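For readers unfamiliar with the idiom, `extend self` makes every instance method of a module callable on the module itself, which is what lets the refactored files use plain `def` instead of wrapping everything in `class << self`. A minimal sketch, assuming a hypothetical module `ExtendSelfSketch` and method `to_radians` that are not part of the gem:

```ruby
# Hypothetical module, used only to illustrate the `extend self` idiom.
module ExtendSelfSketch
  # Same constant as in the diff: radians per degree.
  RAD_PER_DEG = Math::PI / 180

  # A plain instance method of the module.
  def to_radians(degrees)
    degrees * RAD_PER_DEG
  end

  # Adds the module to its own singleton class, so the instance
  # methods above can be called as ExtendSelfSketch.to_radians(...).
  extend self
end

half_turn = ExtendSelfSketch.to_radians(180) # approximately Math::PI
```

The "require order is important" comment most likely refers to `tanimoto.rb` aliasing `jaccard`, which has to exist before the alias is created.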
data/lib/measurable/cosine.rb
CHANGED
@@ -1,10 +1,27 @@
|
|
1
1
|
module Measurable
|
2
|
-
class << self
|
3
|
-
def cosine(u, v)
|
4
|
-
dot_product = dot(u, v)
|
5
|
-
normalization = self.euclidean_normalize * other.euclidean_normalize
|
6
2
|
|
7
|
-
|
8
|
-
|
3
|
+
# call-seq:
|
4
|
+
# cosine(u, v) -> Float
|
5
|
+
#
|
6
|
+
# Calculate the similarity between the orientation of two vectors.
|
7
|
+
#
|
8
|
+
# See: http://en.wikipedia.org/wiki/Cosine_similarity
|
9
|
+
#
|
10
|
+
# * *Arguments* :
|
11
|
+
# - +u+ -> An array of Numeric objects.
|
12
|
+
# - +v+ -> An array of Numeric objects.
|
13
|
+
# * *Returns* :
|
14
|
+
# - The normalized dot product of +u+ and +v+, that is, the angle between
|
15
|
+
# them in the n-dimensional space.
|
16
|
+
# * *Raises* :
|
17
|
+
# - +ArgumentError+ -> The sizes of +u+ and +v+ doesn't match.
|
18
|
+
#
|
19
|
+
def cosine(u, v)
|
20
|
+
# TODO: Change this to a more specific, custom-made exception.
|
21
|
+
raise ArgumentError if u.size != v.size
|
22
|
+
|
23
|
+
dot_product = u.zip(v).reduce(0.0) { |acc, ary| acc += ary[0] * ary[1] }
|
24
|
+
|
25
|
+
dot_product / (euclidean(u) * euclidean(v))
|
9
26
|
end
|
10
27
|
end
|
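The new `cosine` returns the normalized dot product, i.e. a similarity, while the README example above shows `0.00772` for the same inputs. The two values appear to be related by `1 - similarity`. A standalone sketch of the calculation, assuming a hypothetical helper `cosine_similarity_sketch` that is not the gem's API:

```ruby
# Hypothetical helper; mirrors the formula in the diff without the gem.
def cosine_similarity_sketch(u, v)
  raise ArgumentError, "vectors must have the same size" if u.size != v.size

  # Dot product of u and v.
  dot = u.zip(v).reduce(0.0) { |acc, (a, b)| acc + a * b }

  # Euclidean norms of each vector.
  norm_u = Math.sqrt(u.reduce(0.0) { |acc, a| acc + a * a })
  norm_v = Math.sqrt(v.reduce(0.0) { |acc, b| acc + b * b })

  dot / (norm_u * norm_v)
end

similarity = cosine_similarity_sketch([1, 2], [2, 3]) # about 0.99228, as in cosine_spec.rb
distance   = 1 - similarity                           # about 0.00772, as in the README
```

Note that a zero vector makes the denominator `0.0`, so the result is `NaN`; this sketch does not guard against that.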
data/lib/measurable/euclidean.rb
CHANGED
@@ -1,40 +1,76 @@
|
|
1
1
|
module Measurable
|
2
|
-
|
3
|
-
|
4
|
-
|
5
|
-
|
6
|
-
|
7
|
-
|
8
|
-
|
9
|
-
|
10
|
-
|
11
|
-
|
12
|
-
|
13
|
-
|
14
|
-
|
15
|
-
|
16
|
-
|
17
|
-
|
18
|
-
|
19
|
-
|
20
|
-
|
2
|
+
|
3
|
+
# call-seq:
|
4
|
+
# euclidean(u) -> Float
|
5
|
+
# euclidean(u, v) -> Float
|
6
|
+
#
|
7
|
+
# Calculate the ordinary distance between arrays +u+ and +v+.
|
8
|
+
#
|
9
|
+
# If +v+ isn't given, calculate the Euclidean norm of +u+.
|
10
|
+
#
|
11
|
+
# See: http://en.wikipedia.org/wiki/Euclidean_distance#N_dimensions
|
12
|
+
#
|
13
|
+
# * *Arguments* :
|
14
|
+
# - +u+ -> An array of Numeric objects.
|
15
|
+
# - +v+ -> (Optional) An array of Numeric objects.
|
16
|
+
# * *Returns* :
|
17
|
+
# - The euclidean norm of +u+ or the euclidean distance between +u+ and
|
18
|
+
# +v+.
|
19
|
+
# * *Raises* :
|
20
|
+
# - +ArgumentError+ -> The sizes of +u+ and +v+ doesn't match.
|
21
|
+
#
|
22
|
+
def euclidean(u, v = nil)
|
23
|
+
# If the second argument is nil, the method should return the norm of
|
24
|
+
# vector u. For this, we need the distance between u and the origin.
|
25
|
+
if v.nil?
|
26
|
+
v = Array.new(u.size, 0)
|
21
27
|
end
|
22
|
-
|
23
|
-
|
24
|
-
|
25
|
-
|
26
|
-
|
27
|
-
|
28
|
-
|
29
|
-
|
30
|
-
|
31
|
-
|
32
|
-
|
33
|
-
|
34
|
-
|
35
|
-
|
36
|
-
|
37
|
-
|
28
|
+
|
29
|
+
# TODO: Change this to a more specific, custom-made exception.
|
30
|
+
raise ArgumentError if u.size != v.size
|
31
|
+
|
32
|
+
sum = u.zip(v).reduce(0.0) do |acc, ary|
|
33
|
+
acc += (ary[0] - ary[-1]) ** 2
|
34
|
+
end
|
35
|
+
|
36
|
+
Math.sqrt(sum)
|
37
|
+
end
|
38
|
+
|
39
|
+
# call-seq:
|
40
|
+
# euclidean_squared(u) -> Float
|
41
|
+
# euclidean_squared(u, v) -> Float
|
42
|
+
#
|
43
|
+
# Calculate the same value as euclidean(u, v), but don't take the square root
|
44
|
+
# of it.
|
45
|
+
#
|
46
|
+
# This isn't a metric in the strict sense, i.e. it doesn't respect the
|
47
|
+
# triangle inequality. However, the squared Euclidean distance is very useful
|
48
|
+
# whenever only the relative values of distances are important, for example
|
49
|
+
# in optimization problems.
|
50
|
+
#
|
51
|
+
# See: http://en.wikipedia.org/wiki/Euclidean_distance#Squared_Euclidean_distance
|
52
|
+
#
|
53
|
+
# * *Arguments* :
|
54
|
+
# - +u+ -> An array of Numeric objects.
|
55
|
+
# - +v+ -> (Optional) An array of Numeric objects.
|
56
|
+
# * *Returns* :
|
57
|
+
# - The squared value of the euclidean norm of +u+ or of the euclidean
|
58
|
+
# distance between +u+ and +v+.
|
59
|
+
# * *Raises* :
|
60
|
+
# - +ArgumentError+ -> The sizes of +u+ and +v+ doesn't match.
|
61
|
+
#
|
62
|
+
def euclidean_squared(u, v = nil)
|
63
|
+
# If the second argument is nil, the method should return the norm of
|
64
|
+
# vector u. For this, we need the distance between u and the origin.
|
65
|
+
if v.nil?
|
66
|
+
v = Array.new(u.size, 0)
|
67
|
+
end
|
68
|
+
|
69
|
+
# TODO: Change this to a more specific, custom-made exception.
|
70
|
+
raise ArgumentError if u.size != v.size
|
71
|
+
|
72
|
+
u.zip(v).reduce(0.0) do |acc, ary|
|
73
|
+
acc += (ary[0] - ary[-1]) ** 2
|
38
74
|
end
|
39
75
|
end
|
40
76
|
end
|
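Both methods above share the same shape: default `v` to the origin, check sizes, then sum the squared differences. A condensed sketch under the assumption of hypothetical helper names (`euclidean_squared_sketch`, `euclidean_sketch`) that are not part of the gem:

```ruby
# Hypothetical helpers; same arithmetic as the diff, written standalone.
def euclidean_squared_sketch(u, v = nil)
  # With one argument, measure the distance to the origin (the norm).
  v ||= Array.new(u.size, 0)
  raise ArgumentError, "vectors must have the same size" if u.size != v.size

  # The block's return value becomes the next accumulator,
  # so plain `acc + ...` is enough here.
  u.zip(v).reduce(0.0) { |acc, (a, b)| acc + (a - b)**2 }
end

def euclidean_sketch(u, v = nil)
  Math.sqrt(euclidean_squared_sketch(u, v))
end

norm         = euclidean_sketch([3, 4])                   # 3-4-5 triangle
squared_norm = euclidean_squared_sketch([3, 4])
unit_step    = euclidean_sketch([1, 3, 16], [1, 4, 16])   # vectors from euclidean_spec.rb
```

In the diff, the block uses `acc += ...`; the assignment is harmless but redundant, since `reduce` only uses the value the block returns.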
data/lib/measurable/haversine.rb
CHANGED
@@ -1,46 +1,71 @@
|
|
1
|
-
# Notes:
|
2
|
-
#
|
3
|
-
# translated into Ruby based on information contained in:
|
4
|
-
# http://mathforum.org/library/drmath/view/51879.html
|
5
|
-
# Dr. Rick and Dr. Peterson - 4/20/99
|
6
|
-
#
|
7
|
-
# http://www.movable-type.co.uk/scripts/latlong.html
|
8
|
-
# http://en.wikipedia.org/wiki/Haversine_formula
|
9
|
-
#
|
10
|
-
# This formula can compute accurate distances between two points given latitude
|
11
|
-
# and longitude, even for short distances.
|
12
|
-
|
13
1
|
module Measurable
|
14
2
|
|
15
|
-
|
16
|
-
|
17
|
-
|
18
|
-
#
|
19
|
-
|
20
|
-
|
21
|
-
|
22
|
-
|
23
|
-
|
3
|
+
# Earth radius in miles.
|
4
|
+
EARTH_RADIUS_IN_MILES = 3956
|
5
|
+
|
6
|
+
# Earth radius in kilometers. Some algorithms use 6367.
|
7
|
+
EARTH_RADIUS_IN_KILOMETERS = 6371
|
8
|
+
|
9
|
+
# The great circle distance returned will be in whatever units R is in.
|
10
|
+
# Provides
|
11
|
+
EARTH_RADIUS = {
|
12
|
+
:miles => EARTH_RADIUS_IN_MILES,
|
13
|
+
:km => EARTH_RADIUS_IN_KILOMETERS,
|
14
|
+
:feet => EARTH_RADIUS_IN_MILES * 5282,
|
15
|
+
:meters => EARTH_RADIUS_IN_KILOMETERS * 1000
|
24
16
|
}
|
25
17
|
|
26
|
-
|
27
|
-
|
28
|
-
|
29
|
-
|
18
|
+
# call-seq:
|
19
|
+
# haversine(u, v) -> Float
|
20
|
+
#
|
21
|
+
# Compute accurate distances between two points given their latitudes and
|
22
|
+
# longitudes, even for short distances. This isn't a distance measure in the
|
23
|
+
# same sense as the other methods in +Measurable+.
|
24
|
+
#
|
25
|
+
# The distance returned is the great circle (or orthodromic) distance between
|
26
|
+
# +u+ and +v+, which is the shortest distance between them on the surface of
|
27
|
+
# a sphere. Thus, this implementation considers the Earth to be a sphere.
|
28
|
+
#
|
29
|
+
# Reminding that the input vectors are of the form [latitude, longitude] in
|
30
|
+
# degrees, so if you have the coordinates [23 32' S, 46 37' W] (from São
|
31
|
+
# Paulo), the corresponding vector is [-23.53333, -46.61667].
|
32
|
+
#
|
33
|
+
# References:
|
34
|
+
# - http://www.movable-type.co.uk/scripts/latlong.html
|
35
|
+
# - http://en.wikipedia.org/wiki/Haversine_formula
|
36
|
+
# - http://en.wikipedia.org/wiki/Great-circle_distance
|
37
|
+
#
|
38
|
+
# * *Arguments* :
|
39
|
+
# - +u+ -> An array of Numeric objects.
|
40
|
+
# - +v+ -> An array of Numeric objects.
|
41
|
+
# - +unit+ -> (Optional) A Symbol representing the unit of measure. Available
|
42
|
+
# options are +:miles+, +:feet+, +:km+ and +:meters+.
|
43
|
+
# * *Returns* :
|
44
|
+
# - The great circle distance between +u+ and +v+.
|
45
|
+
# * *Raises* :
|
46
|
+
# - +ArgumentError+ -> The size of +u+ and +v+ must be 2.
|
47
|
+
# - +ArgumentError+ -> +unit+ must be a Symbol.
|
48
|
+
#
|
49
|
+
def haversine(u, v, unit = :meters)
|
50
|
+
# TODO: Create better exceptions.
|
51
|
+
raise ArgumentError if u.size != 2 || v.size != 2
|
52
|
+
raise ArgumentError if unit.class != Symbol
|
53
|
+
|
54
|
+
dlat = u[0] - v[0]
|
55
|
+
dlon = u[1] - v[1]
|
30
56
|
|
31
|
-
|
32
|
-
|
57
|
+
dlon_rad = dlon * RAD_PER_DEG
|
58
|
+
dlat_rad = dlat * RAD_PER_DEG
|
33
59
|
|
34
|
-
|
35
|
-
|
60
|
+
lat1_rad = v[0] * RAD_PER_DEG
|
61
|
+
lon1_rad = v[1] * RAD_PER_DEG
|
36
62
|
|
37
|
-
|
38
|
-
|
63
|
+
lat2_rad = u[0] * RAD_PER_DEG
|
64
|
+
lon2_rad = u[1] * RAD_PER_DEG
|
39
65
|
|
40
|
-
|
41
|
-
|
66
|
+
a = (Math.sin(dlat_rad / 2)) ** 2 + Math.cos(lat1_rad) * Math.cos(lat2_rad) * (Math.sin(dlon_rad / 2)) ** 2
|
67
|
+
c = 2 * Math.atan2(Math.sqrt(a), Math.sqrt(1 - a))
|
42
68
|
|
43
|
-
|
44
|
-
end
|
69
|
+
EARTH_RADIUS[unit] * c
|
45
70
|
end
|
46
71
|
end
|
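The haversine computation above can be tried without the gem. The sketch below assumes hypothetical names (`RAD_PER_DEG_SKETCH`, `EARTH_RADIUS_KM_SKETCH`, `haversine_km_sketch`) and only covers kilometers; it reuses the 6371 km radius and the Tokyo and São Paulo coordinates from the diff:

```ruby
# Hypothetical constants and helper; not the gem's API.
RAD_PER_DEG_SKETCH = Math::PI / 180
EARTH_RADIUS_KM_SKETCH = 6371

def haversine_km_sketch(u, v)
  if u.size != 2 || v.size != 2
    raise ArgumentError, "expected [latitude, longitude] pairs"
  end

  # Convert both points from degrees to radians.
  lat1, lon1 = u.map { |deg| deg * RAD_PER_DEG_SKETCH }
  lat2, lon2 = v.map { |deg| deg * RAD_PER_DEG_SKETCH }

  # Haversine of the central angle between the two points.
  a = Math.sin((lat2 - lat1) / 2)**2 +
      Math.cos(lat1) * Math.cos(lat2) * Math.sin((lon2 - lon1) / 2)**2

  # Central angle, scaled by the sphere's radius.
  2 * EARTH_RADIUS_KM_SKETCH * Math.atan2(Math.sqrt(a), Math.sqrt(1 - a))
end

tokyo     = [35.66667, 139.75]
sao_paulo = [-23.53333, -46.61667]
distance_km = haversine_km_sketch(tokyo, sao_paulo)
```

The result should land close to the 18533 km that `haversine_spec.rb` expects. Separately, the `:feet` entry in `EARTH_RADIUS` multiplies by 5282, whereas a statute mile is 5280 feet; that looks like a typo worth confirming upstream.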
data/lib/measurable/jaccard.rb
CHANGED
@@ -1,26 +1,69 @@
|
|
1
|
-
# http://en.wikipedia.org/wiki/Jaccard_coefficient
|
2
1
|
module Measurable
|
3
|
-
|
4
|
-
|
5
|
-
|
2
|
+
|
3
|
+
# call-seq:
|
4
|
+
# jaccard_index(u, v) -> Float
|
5
|
+
#
|
6
|
+
# Give the similarity between two binary vectors +u+ and +v+. Calculated as:
|
7
|
+
# jaccard_index = |intersection| / |union|
|
8
|
+
#
|
9
|
+
# In which intersection and union refer to +u+ and +v+ and |x| is the
|
10
|
+
# cardinality of set x.
|
11
|
+
#
|
12
|
+
# For example:
|
13
|
+
# jaccard_index([1, 0, 1], [1, 1, 1]) == 0.666...
|
14
|
+
#
|
15
|
+
# Because |intersection| = |(1, 0, 1)| = 2 and |union| = |(1, 1, 1)| = 3.
|
16
|
+
#
|
17
|
+
# See: http://en.wikipedia.org/wiki/Jaccard_coefficient
|
18
|
+
#
|
19
|
+
# * *Arguments* :
|
20
|
+
# - +u+ -> Array of 1s and 0s.
|
21
|
+
# - +v+ -> Array of 1s and 0s.
|
22
|
+
# * *Returns* :
|
23
|
+
# - Float value representing the Jaccard similarity coefficient between
|
24
|
+
# +u+ and +v+.
|
25
|
+
# * *Raises* :
|
26
|
+
# - +ArgumentError+ -> The size of the input arrays doesn't match.
|
27
|
+
#
|
28
|
+
def jaccard_index(u, v)
|
29
|
+
# TODO: Change this to a more specific, custom-made exception.
|
30
|
+
raise ArgumentError if u.size != v.size
|
31
|
+
|
32
|
+
intersection = u.zip(v).reduce(0) do |acc, elem|
|
33
|
+
# Both u and v must have this element.
|
34
|
+
elem[0] + elem[1] == 2 ? (acc + 1) : acc
|
6
35
|
end
|
7
|
-
|
8
|
-
|
9
|
-
|
10
|
-
|
11
|
-
|
12
|
-
intersection / union
|
13
|
-
end
|
14
|
-
|
15
|
-
def binary_jaccard(u, v)
|
16
|
-
1 - binary_jaccard_index(u, v)
|
17
|
-
end
|
18
|
-
|
19
|
-
def binary_jaccard_index(u, v)
|
20
|
-
intersection = binary_intersection(u, v).delete_if {|x| x == 0}.size.to_f
|
21
|
-
union = binary_union(u, v).delete_if {|x| x == 0}.size.to_f
|
22
|
-
|
23
|
-
intersection / union
|
36
|
+
|
37
|
+
union = u.zip(v).reduce(0) do |acc, elem|
|
38
|
+
# One of u and v must have this element.
|
39
|
+
elem[0] + elem[1] >= 1 ? (acc + 1) : acc
|
24
40
|
end
|
41
|
+
|
42
|
+
intersection.to_f / union
|
43
|
+
end
|
44
|
+
|
45
|
+
# call-seq:
|
46
|
+
# jaccard(u, v) -> Float
|
47
|
+
#
|
48
|
+
# The jaccard distance is a measure of dissimilarity between two sets. It is
|
49
|
+
# calculated as:
|
50
|
+
# jaccard_distance = 1 - jaccard_index
|
51
|
+
#
|
52
|
+
# This is a proper metric, i.e. the following conditions hold:
|
53
|
+
# - Symmetry: jaccard(u, v) == jaccard(v, u)
|
54
|
+
# - Non-negative: jaccard(u, v) >= 0
|
55
|
+
# - Coincidence axiom: jaccard(u, v) == 0 if u == v
|
56
|
+
# - Triangular inequality: jaccard(u, v) <= jaccard(u, w) + jaccard(w, v)
|
57
|
+
#
|
58
|
+
# * *Arguments* :
|
59
|
+
# - +u+ -> Array of 1s and 0s.
|
60
|
+
# - +v+ -> Array of 1s and 0s.
|
61
|
+
# * *Returns* :
|
62
|
+
# - Float value representing the dissimilarity between +u+ and +v+.
|
63
|
+
# * *Raises* :
|
64
|
+
# - +ArgumentError+ -> The size of the input arrays doesn't match.
|
65
|
+
#
|
66
|
+
def jaccard(u, v)
|
67
|
+
1 - jaccard_index(u, v)
|
25
68
|
end
|
26
69
|
end
|
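The index and distance defined above reduce to two counts over the zipped vectors. A minimal sketch, assuming hypothetical helper names (`jaccard_index_sketch`, `jaccard_distance_sketch`) and inputs made only of 1s and 0s:

```ruby
# Hypothetical helpers; same definition as the diff, written standalone.
def jaccard_index_sketch(u, v)
  raise ArgumentError, "vectors must have the same size" if u.size != v.size

  pairs = u.zip(v)
  # Positions set in both vectors.
  intersection = pairs.count { |a, b| a == 1 && b == 1 }
  # Positions set in at least one vector.
  union = pairs.count { |a, b| a == 1 || b == 1 }

  intersection.to_f / union
end

def jaccard_distance_sketch(u, v)
  1 - jaccard_index_sketch(u, v)
end

index    = jaccard_index_sketch([1, 0, 1], [1, 1, 1])    # 2 shared out of 3 set
distance = jaccard_distance_sketch([1, 0, 1], [1, 1, 1])
```

When both vectors are all zeros the union is empty and the division yields `NaN`; neither this sketch nor the code in the diff handles that case.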
data/lib/measurable/maxmin.rb
CHANGED
@@ -1,13 +1,34 @@
|
|
1
1
|
module Measurable
|
2
|
-
|
3
|
-
|
4
|
-
|
5
|
-
|
6
|
-
|
7
|
-
|
8
|
-
|
9
|
-
|
10
|
-
|
2
|
+
|
3
|
+
# call-seq:
|
4
|
+
# maxmin(u, v) -> Float
|
5
|
+
#
|
6
|
+
# The "Max-min distance" is used to measure similarity between two vectors.
|
7
|
+
#
|
8
|
+
# When used in k-means clustering, this similarity measure can give better
|
9
|
+
# results in some datasets, as pointed out in the paper "K-means clustering
|
10
|
+
# using Max-min distance measure" --- Visalakshi, N. K.; Suguna, J.
|
11
|
+
#
|
12
|
+
# See: http://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=05156398
|
13
|
+
#
|
14
|
+
# * *Arguments* :
|
15
|
+
# - +u+ -> An array of Numeric objects.
|
16
|
+
# - +v+ -> An array of Numeric objects.
|
17
|
+
# * *Returns* :
|
18
|
+
# - Similarity between +u+ and +v+.
|
19
|
+
# * *Raises* :
|
20
|
+
# - +ArgumentError+ -> The sizes of +u+ and +v+ doesn't match.
|
21
|
+
#
|
22
|
+
def maxmin(u, v)
|
23
|
+
# TODO: Change this to a more specific, custom-made exception.
|
24
|
+
raise ArgumentError if u.size != v.size
|
25
|
+
|
26
|
+
sum_min, sum_max = u.zip(v).reduce([0.0, 0.0]) do |acc, attributes|
|
27
|
+
acc[0] += attributes.min
|
28
|
+
acc[1] += attributes.max
|
29
|
+
acc
|
11
30
|
end
|
31
|
+
|
32
|
+
sum_min / sum_max
|
12
33
|
end
|
13
34
|
end
|
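The max-min measure above is the ratio between the sum of element-wise minima and the sum of element-wise maxima. A standalone sketch, assuming a hypothetical helper `maxmin_sketch` that is not the gem's API:

```ruby
# Hypothetical helper; same ratio as the diff, written standalone.
def maxmin_sketch(u, v)
  raise ArgumentError, "vectors must have the same size" if u.size != v.size

  pairs = u.zip(v)
  # Sum of the smaller and of the larger value at each position.
  sum_min = pairs.reduce(0.0) { |acc, pair| acc + pair.min }
  sum_max = pairs.reduce(0.0) { |acc, pair| acc + pair.max }

  sum_min / sum_max
end

# Vectors from maxmin_spec.rb: minima sum to 20, maxima sum to 21.
similarity = maxmin_sketch([1, 3, 16], [1, 4, 16])
```

Identical vectors give a ratio of 1, so this value behaves as a similarity rather than a distance, which matches the wording of the doc comment.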
data/lib/measurable/tanimoto.rb
CHANGED
@@ -1,11 +1,32 @@
|
|
1
|
-
# http://en.wikipedia.org/wiki/Jaccard_index#Tanimoto_coefficient_.28extended_Jaccard_coefficient.29
|
2
1
|
module Measurable
|
3
|
-
|
4
|
-
|
5
|
-
|
6
|
-
|
7
|
-
|
8
|
-
|
9
|
-
|
2
|
+
|
3
|
+
# Tanimoto similarity is the same as Jaccard similarity.
|
4
|
+
alias :tanimoto_similarity :jaccard
|
5
|
+
|
6
|
+
# call-seq:
|
7
|
+
# tanimoto(u, v) -> Float
|
8
|
+
#
|
9
|
+
# Tanimoto distance is a coefficient explicitly chosen such as to allow for
|
10
|
+
# two dissimilar specimens to be similar to a third one. This breaks the
|
11
|
+
# triangle inequality, thus this isn't a metric.
|
12
|
+
#
|
13
|
+
# More information and references on this are needed. It's left here mostly
|
14
|
+
# as a piece of curiosity.
|
15
|
+
#
|
16
|
+
# See: # http://en.wikipedia.org/wiki/Jaccard_index#Tanimoto.27s_Definitions_of_Similarity_and_Distance
|
17
|
+
#
|
18
|
+
# * *Arguments* :
|
19
|
+
# - +u+ -> An array of Numeric objects.
|
20
|
+
# - +v+ -> An array of Numeric objects.
|
21
|
+
# * *Returns* :
|
22
|
+
# - A measure of the similarity between +u+ and +v+.
|
23
|
+
# * *Raises* :
|
24
|
+
# - +ArgumentError+ -> The sizes of +u+ and +v+ doesn't match.
|
25
|
+
#
|
26
|
+
def tanimoto(u, v)
|
27
|
+
# TODO: Change this to a more specific, custom-made exception.
|
28
|
+
raise ArgumentError if u.size != v.size
|
29
|
+
|
30
|
+
-Math.log2(jaccard_index(u, v))
|
10
31
|
end
|
11
32
|
end
|
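The Tanimoto distance above is the negative base-2 logarithm of the Jaccard index. A self-contained sketch, assuming hypothetical helper names (`jaccard_index_for_tanimoto_sketch`, `tanimoto_sketch`) and binary input vectors:

```ruby
# Hypothetical helpers; not the gem's API.
def jaccard_index_for_tanimoto_sketch(u, v)
  pairs = u.zip(v)
  intersection = pairs.count { |a, b| a == 1 && b == 1 }
  union = pairs.count { |a, b| a == 1 || b == 1 }
  intersection.to_f / union
end

def tanimoto_sketch(u, v)
  raise ArgumentError, "vectors must have the same size" if u.size != v.size

  # -log2 of a value in (0, 1] is non-negative; an index of 0 gives Infinity.
  -Math.log2(jaccard_index_for_tanimoto_sketch(u, v))
end

distance = tanimoto_sketch([1, 0, 1], [1, 1, 1])   # -log2(2/3), as in tanimoto_spec.rb
disjoint = tanimoto_sketch([1, 0, 1], [0, 1, 0])   # no shared positions
```

One thing that may deserve a second look in the diff: `tanimoto_similarity` is aliased to `jaccard`, which is the distance method, while the comment describes a similarity; `jaccard_index` may have been the intended target.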
data/lib/measurable/version.rb
CHANGED
data/spec/cosine_spec.rb
ADDED
@@ -0,0 +1,29 @@
|
|
1
|
+
describe "Cosine distance" do
|
2
|
+
|
3
|
+
before :all do
|
4
|
+
@u = [1, 2]
|
5
|
+
@v = [2, 3]
|
6
|
+
@w = [4, 5]
|
7
|
+
end
|
8
|
+
|
9
|
+
it "accepts two arguments" do
|
10
|
+
expect { Measurable.cosine(@u, @v) }.to_not raise_error
|
11
|
+
expect { Measurable.cosine(@u, @v, @w) }.to raise_error(ArgumentError)
|
12
|
+
end
|
13
|
+
|
14
|
+
it "should be symmetric" do
|
15
|
+
x = Measurable.cosine(@u, @v)
|
16
|
+
y = Measurable.cosine(@v, @u)
|
17
|
+
|
18
|
+
x.should be_within(TOLERANCE).of(y)
|
19
|
+
end
|
20
|
+
|
21
|
+
it "should return the correct value" do
|
22
|
+
x = Measurable.cosine(@u, @v)
|
23
|
+
x.should be_within(TOLERANCE).of(0.992277877)
|
24
|
+
end
|
25
|
+
|
26
|
+
it "shouldn't work with vectors of different length" do
|
27
|
+
expect { Measurable.cosine(@u, [1, 3, 5, 7]) }.to raise_error
|
28
|
+
end
|
29
|
+
end
|
data/spec/euclidean_spec.rb
ADDED
@@ -0,0 +1,61 @@
|
|
1
|
+
describe "Euclidean" do
|
2
|
+
|
3
|
+
before :all do
|
4
|
+
@u = [1, 3, 16]
|
5
|
+
@v = [1, 4, 16]
|
6
|
+
@w = [4, 5, 6]
|
7
|
+
end
|
8
|
+
|
9
|
+
context "Distance" do
|
10
|
+
it "accepts two arguments" do
|
11
|
+
expect { Measurable.euclidean(@u, @v) }.to_not raise_error
|
12
|
+
expect { Measurable.euclidean(@u, @v, @w) }.to raise_error(ArgumentError)
|
13
|
+
end
|
14
|
+
|
15
|
+
it "accepts one argument and returns the vector's norm" do
|
16
|
+
# Remember that 3^2 + 4^2 = 5^2.
|
17
|
+
Measurable.euclidean([3, 4]).should == 5
|
18
|
+
end
|
19
|
+
|
20
|
+
it "should be symmetric" do
|
21
|
+
Measurable.euclidean(@u, @v).should == Measurable.euclidean(@v, @u)
|
22
|
+
end
|
23
|
+
|
24
|
+
it "should return the correct value" do
|
25
|
+
Measurable.euclidean(@u, @u).should == 0
|
26
|
+
Measurable.euclidean(@u, @v).should == 1
|
27
|
+
end
|
28
|
+
|
29
|
+
it "shouldn't work with vectors of different length" do
|
30
|
+
expect { Measurable.euclidean(@u, [2, 2, 2, 2]) }.to raise_error
|
31
|
+
end
|
32
|
+
end
|
33
|
+
|
34
|
+
context "Squared Distance" do
|
35
|
+
it "accepts two arguments" do
|
36
|
+
expect { Measurable.euclidean_squared(@u, @v) }.to_not raise_error
|
37
|
+
expect { Measurable.euclidean_squared(@u, @v, @w) }.to raise_error(ArgumentError)
|
38
|
+
end
|
39
|
+
|
40
|
+
it "accepts one argument and returns the vector's norm" do
|
41
|
+
# Remember that 3^2 + 4^2 = 5^2.
|
42
|
+
Measurable.euclidean_squared([3, 4]).should == 25
|
43
|
+
end
|
44
|
+
|
45
|
+
it "should be symmetric" do
|
46
|
+
x = Measurable.euclidean_squared(@u, @v)
|
47
|
+
y = Measurable.euclidean_squared(@v, @u)
|
48
|
+
|
49
|
+
x.should == y
|
50
|
+
end
|
51
|
+
|
52
|
+
it "should return the correct value" do
|
53
|
+
Measurable.euclidean_squared(@u, @u).should == 0
|
54
|
+
Measurable.euclidean_squared(@u, @v).should == 1
|
55
|
+
end
|
56
|
+
|
57
|
+
it "shouldn't work with vectors of different length" do
|
58
|
+
expect { Measurable.euclidean_squared(@u, [2, 2, 2, 2]) }.to raise_error
|
59
|
+
end
|
60
|
+
end
|
61
|
+
end
|
data/spec/haversine_spec.rb
ADDED
@@ -0,0 +1,37 @@
|
|
1
|
+
describe "Haversine distance" do
|
2
|
+
|
3
|
+
before :all do
|
4
|
+
# We have very big errors in this formula, due to:
|
5
|
+
# - The Earth is considered a sphere.
|
6
|
+
# - Earth's radius is considered constant (same as above).
|
7
|
+
#
|
8
|
+
# Given these conditions, I'll just assume the error to be less than 1.
|
9
|
+
# TODO: Calculate better error estimates.
|
10
|
+
@haversine_tolerance = 1
|
11
|
+
|
12
|
+
@u = [ 35.66667, 139.75] # Tokyo: 35 40' N, 139 45' E.
|
13
|
+
@v = [-23.53333, -46.61667] # São Paulo: 23 32' S, 46 37' W.
|
14
|
+
end
|
15
|
+
|
16
|
+
it "accepts two arguments" do
|
17
|
+
expect { Measurable.haversine(@u, @v) }.to_not raise_error
|
18
|
+
expect { Measurable.haversine(@u, @v, [-24.5, 40.23]) }.to raise_error(ArgumentError)
|
19
|
+
end
|
20
|
+
|
21
|
+
it "should be symmetric" do
|
22
|
+
x = Measurable.haversine(@u, @v)
|
23
|
+
y = Measurable.haversine(@v, @u)
|
24
|
+
|
25
|
+
x.should be_within(TOLERANCE).of(y)
|
26
|
+
end
|
27
|
+
|
28
|
+
it "should return the correct value" do
|
29
|
+
x = Measurable.haversine(@u, @v, :km)
|
30
|
+
|
31
|
+
x.should be_within(@haversine_tolerance).of(18533)
|
32
|
+
end
|
33
|
+
|
34
|
+
it "should only work with [lat, long] vectors" do
|
35
|
+
expect { Measurable.haversine([2, 4], [1, 3, 5, 7]) }.to raise_error
|
36
|
+
end
|
37
|
+
end
|
data/spec/jaccard_spec.rb
ADDED
@@ -0,0 +1,62 @@
|
|
1
|
+
describe "Jaccard" do
|
2
|
+
|
3
|
+
context "Index" do
|
4
|
+
before :all do
|
5
|
+
@u = [1, 0, 1]
|
6
|
+
@v = [1, 1, 1]
|
7
|
+
@w = [0, 1, 0]
|
8
|
+
end
|
9
|
+
|
10
|
+
it "accepts two arguments" do
|
11
|
+
expect { Measurable.jaccard_index(@u, @v) }.to_not raise_error
|
12
|
+
expect { Measurable.jaccard_index(@u, @v, @w) }.to raise_error(ArgumentError)
|
13
|
+
end
|
14
|
+
|
15
|
+
it "should be symmetric" do
|
16
|
+
x = Measurable.jaccard_index(@u, @v)
|
17
|
+
y = Measurable.jaccard_index(@v, @u)
|
18
|
+
|
19
|
+
x.should be_within(TOLERANCE).of(y)
|
20
|
+
end
|
21
|
+
|
22
|
+
it "should return the correct value" do
|
23
|
+
x = Measurable.jaccard_index(@u, @v)
|
24
|
+
|
25
|
+
x.should be_within(TOLERANCE).of(2.0 / 3.0)
|
26
|
+
end
|
27
|
+
|
28
|
+
it "shouldn't work with vectors of different length" do
|
29
|
+
expect { Measurable.jaccard_index(@u, [1, 2, 3, 4]) }.to raise_error
|
30
|
+
end
|
31
|
+
end
|
32
|
+
|
33
|
+
context "Distance" do
|
34
|
+
before :all do
|
35
|
+
@u = [1, 0, 1]
|
36
|
+
@v = [1, 1, 1]
|
37
|
+
@w = [0, 1, 0]
|
38
|
+
end
|
39
|
+
|
40
|
+
it "accepts two arguments" do
|
41
|
+
expect { Measurable.jaccard(@u, @v) }.to_not raise_error
|
42
|
+
expect { Measurable.jaccard(@u, @v, @w) }.to raise_error(ArgumentError)
|
43
|
+
end
|
44
|
+
|
45
|
+
it "should be symmetric" do
|
46
|
+
x = Measurable.jaccard(@u, @v)
|
47
|
+
y = Measurable.jaccard(@v, @u)
|
48
|
+
|
49
|
+
x.should be_within(TOLERANCE).of(y)
|
50
|
+
end
|
51
|
+
|
52
|
+
it "should return the correct value" do
|
53
|
+
x = Measurable.jaccard(@u, @v)
|
54
|
+
|
55
|
+
x.should be_within(TOLERANCE).of(1.0 / 3.0)
|
56
|
+
end
|
57
|
+
|
58
|
+
it "shouldn't work with vectors of different length" do
|
59
|
+
expect { Measurable.jaccard(@u, [1, 2, 3, 4]) }.to raise_error
|
60
|
+
end
|
61
|
+
end
|
62
|
+
end
|
data/spec/maxmin_spec.rb
ADDED
@@ -0,0 +1,30 @@
|
|
1
|
+
describe "Max-min distance" do
|
2
|
+
|
3
|
+
before :all do
|
4
|
+
@u = [1, 3, 16]
|
5
|
+
@v = [1, 4, 16]
|
6
|
+
@w = [4, 5, 6]
|
7
|
+
end
|
8
|
+
|
9
|
+
it "accepts two arguments" do
|
10
|
+
expect { Measurable.maxmin(@u, @v) }.to_not raise_error
|
11
|
+
expect { Measurable.maxmin(@u, @v, @w) }.to raise_error(ArgumentError)
|
12
|
+
end
|
13
|
+
|
14
|
+
it "should be symmetric" do
|
15
|
+
x = Measurable.maxmin(@u, @v)
|
16
|
+
y = Measurable.maxmin(@v, @u)
|
17
|
+
|
18
|
+
x.should be_within(TOLERANCE).of(y)
|
19
|
+
end
|
20
|
+
|
21
|
+
it "should return the correct value" do
|
22
|
+
x = Measurable.maxmin(@u, @v)
|
23
|
+
|
24
|
+
x.should be_within(TOLERANCE).of(0.9523809523)
|
25
|
+
end
|
26
|
+
|
27
|
+
it "shouldn't work with vectors of different length" do
|
28
|
+
expect { Measurable.maxmin(@u, [1, 3, 5, 7]) }.to raise_error
|
29
|
+
end
|
30
|
+
end
|
data/spec/spec_helper.rb
CHANGED
data/spec/tanimoto_spec.rb
ADDED
@@ -0,0 +1,30 @@
|
|
1
|
+
describe "Tanimoto distance" do
|
2
|
+
|
3
|
+
before :all do
|
4
|
+
@u = [1, 0, 1]
|
5
|
+
@v = [1, 1, 1]
|
6
|
+
@w = [0, 1, 0]
|
7
|
+
end
|
8
|
+
|
9
|
+
it "accepts two arguments" do
|
10
|
+
expect { Measurable.tanimoto(@u, @v) }.to_not raise_error
|
11
|
+
expect { Measurable.tanimoto(@u, @v, @w) }.to raise_error(ArgumentError)
|
12
|
+
end
|
13
|
+
|
14
|
+
it "should be symmetric" do
|
15
|
+
x = Measurable.tanimoto(@u, @v)
|
16
|
+
y = Measurable.tanimoto(@v, @u)
|
17
|
+
|
18
|
+
x.should be_within(TOLERANCE).of(y)
|
19
|
+
end
|
20
|
+
|
21
|
+
it "should return the correct value" do
|
22
|
+
x = Measurable.tanimoto(@u, @v)
|
23
|
+
|
24
|
+
x.should be_within(TOLERANCE).of(-Math.log2(2.0 / 3.0))
|
25
|
+
end
|
26
|
+
|
27
|
+
it "shouldn't work with vectors of different length" do
|
28
|
+
expect { Measurable.tanimoto(@u, [1, 3, 5, 7]) }.to raise_error
|
29
|
+
end
|
30
|
+
end
|
metadata
CHANGED
@@ -1,14 +1,14 @@
|
|
1
1
|
--- !ruby/object:Gem::Specification
|
2
2
|
name: measurable
|
3
3
|
version: !ruby/object:Gem::Version
|
4
|
-
version: 0.0.
|
4
|
+
version: 0.0.5
|
5
5
|
platform: ruby
|
6
6
|
authors:
|
7
7
|
- Carlos Agarie
|
8
8
|
autorequire:
|
9
9
|
bindir: bin
|
10
10
|
cert_chain: []
|
11
|
-
date: 2013-
|
11
|
+
date: 2013-07-24 00:00:00.000000000 Z
|
12
12
|
dependencies:
|
13
13
|
- !ruby/object:Gem::Dependency
|
14
14
|
name: bundler
|
@@ -74,8 +74,13 @@ files:
|
|
74
74
|
- lib/measurable/tanimoto.rb
|
75
75
|
- lib/measurable/version.rb
|
76
76
|
- measurable.gemspec
|
77
|
-
- spec/
|
77
|
+
- spec/cosine_spec.rb
|
78
|
+
- spec/euclidean_spec.rb
|
79
|
+
- spec/haversine_spec.rb
|
80
|
+
- spec/jaccard_spec.rb
|
81
|
+
- spec/maxmin_spec.rb
|
78
82
|
- spec/spec_helper.rb
|
83
|
+
- spec/tanimoto_spec.rb
|
79
84
|
homepage: http://github.com/agarie/measurable
|
80
85
|
licenses: []
|
81
86
|
metadata: {}
|
@@ -95,10 +100,15 @@ required_rubygems_version: !ruby/object:Gem::Requirement
|
|
95
100
|
version: '0'
|
96
101
|
requirements: []
|
97
102
|
rubyforge_project:
|
98
|
-
rubygems_version: 2.0.
|
103
|
+
rubygems_version: 2.0.3
|
99
104
|
signing_key:
|
100
105
|
specification_version: 4
|
101
106
|
summary: A Ruby gem with a lot of distance measures for your projects.
|
102
107
|
test_files:
|
103
|
-
- spec/
|
108
|
+
- spec/cosine_spec.rb
|
109
|
+
- spec/euclidean_spec.rb
|
110
|
+
- spec/haversine_spec.rb
|
111
|
+
- spec/jaccard_spec.rb
|
112
|
+
- spec/maxmin_spec.rb
|
104
113
|
- spec/spec_helper.rb
|
114
|
+
- spec/tanimoto_spec.rb
|
data/spec/measurable_spec.rb
DELETED
@@ -1,159 +0,0 @@
-describe Measurable do
-
-  describe "Binary union" do
-
-  end
-
-  describe "Binary intersection" do
-
-  end
-
-  describe "Euclidean" do
-
-    before :all do
-      @u = [1, 3, 16]
-      @v = [1, 4, 16]
-      @w = [4, 5, 6]
-    end
-
-    context "Distance" do
-      it "accepts two arguments" do
-        expect { Measurable.euclidean(@u, @v) }.to_not raise_error
-        expect { Measurable.euclidean(@u, @v, @w) }.to raise_error(ArgumentError)
-      end
-
-      it "accepts one argument and returns the vector's norm" do
-        # Remember that 3^2 + 4^2 = 5^2.
-        Measurable.euclidean([3, 4]).should == 5
-      end
-
-      it "should be symmetric" do
-        Measurable.euclidean(@u, @v).should == Measurable.euclidean(@v, @u)
-      end
-
-      it "should return the correct value" do
-        Measurable.euclidean(@u, @u).should == 0
-        Measurable.euclidean(@u, @v).should == 1
-      end
-
-      it "shouldn't work with vectors of different length" do
-        expect { Measurable.euclidean(@u, [2, 2, 2, 2]) }.to raise_error
-      end
-    end
-
-    context "Squared Distance" do
-      it "accepts two arguments" do
-        expect { Measurable.euclidean_squared(@u, @v) }.to_not raise_error
-        expect { Measurable.euclidean_squared(@u, @v, @w) }.to raise_error(ArgumentError)
-      end
-
-      it "accepts one argument and returns the vector's norm" do
-        # Remember that 3^2 + 4^2 = 5^2.
-        Measurable.euclidean_squared([3, 4]).should == 25
-      end
-
-      it "should be symmetric" do
-        x = Measurable.euclidean_squared(@u, @v)
-        y = Measurable.euclidean_squared(@v, @u)
-
-        x.should == y
-      end
-
-      it "should return the correct value" do
-        Measurable.euclidean_squared(@u, @u).should == 0
-        Measurable.euclidean_squared(@u, @v).should == 1
-      end
-
-      it "shouldn't work with vectors of different length" do
-        expect { Measurable.euclidean_squared(@u, [2, 2, 2, 2]) }.to raise_error
-      end
-    end
-
-  end
-
-  describe "Cosine distance" do
-    it "accepts two arguments"
-
-    it "accepts one argument and returns the vector's norm"
-
-    it "should handle NaN's"
-
-    it "should be symmetric"
-
-    it "should return the correct value"
-
-    it "shouldn't work with vectors of different length"
-  end
-
-  describe "Chebyshev distance" do
-    it "accepts two arguments"
-
-    it "accepts one argument and returns the vector's norm"
-
-    it "should be symmetric"
-
-    it "should return the correct value"
-
-    it "shouldn't work with vectors of different length"
-  end
-
-  describe "Tanimoto distance" do
-    it "accepts two arguments"
-
-    it "accepts one argument and returns the vector's norm"
-
-    it "should be symmetric"
-
-    it "should return the correct value"
-
-    it "shouldn't work with vectors of different length"
-  end
-
-  describe "Haversine distance" do
-    it "accepts two arguments"
-
-    it "accepts one argument and returns the vector's norm"
-
-    it "should be symmetric"
-
-    it "should return the correct value"
-
-    it "shouldn't work with vectors of different length"
-  end
-
-  describe "Jaccard distance" do
-    it "accepts two arguments"
-
-    it "accepts one argument and returns the vector's norm"
-
-    it "should be symmetric"
-
-    it "should return the correct value"
-
-    it "shouldn't work with vectors of different length"
-  end
-
-  describe "Binary Jaccard distance" do
-    it "accepts two arguments"
-
-    it "accepts one argument and returns the vector's norm"
-
-    it "should be symmetric"
-
-    it "should return the correct value"
-
-    it "shouldn't work with vectors of different length"
-  end
-
-  describe "Max-min distance" do
-    it "accepts two arguments"
-
-    it "accepts one argument and returns the vector's norm"
-
-    it "should be symmetric"
-
-    it "should return the correct value"
-
-    it "shouldn't work with vectors of different length"
-  end
-end
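For reference, the Euclidean examples in the removed spec describe a small contract: one argument yields the vector's norm, two arguments yield the distance between them, the result is symmetric, and vectors of different lengths raise an error. The following is a minimal sketch of code that would satisfy those examples. It is not the gem's actual implementation; the module name `EuclideanSketch` and the choice of `ArgumentError` for mismatched lengths are assumptions made for illustration.

```ruby
# Hypothetical stand-in module, NOT the gem's source. It only mirrors the
# behaviour described by the removed Euclidean specs.
module EuclideanSketch
  module_function

  # One argument: the Euclidean norm of u.
  # Two arguments: the Euclidean distance between u and v.
  def euclidean(u, v = nil)
    Math.sqrt(euclidean_squared(u, v))
  end

  # Same as above, without the square root.
  def euclidean_squared(u, v = nil)
    # With a single vector, measure against the origin.
    v ||= Array.new(u.size, 0)

    # Assumption: mismatched lengths raise ArgumentError. The spec only
    # requires that some error is raised.
    raise ArgumentError, "vectors must have the same length" if u.size != v.size

    u.zip(v).reduce(0) { |sum, (a, b)| sum + (a - b)**2 }
  end
end
```

Calling `EuclideanSketch.euclidean([3, 4])` follows the spec's own reminder that 3^2 + 4^2 = 5^2. Passing a third vector fails with Ruby's usual wrong-number-of-arguments `ArgumentError`, which matches the "accepts two arguments" example.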