measurable 0.0.4 → 0.0.5
- checksums.yaml +4 -4
- data/Gemfile.lock +1 -1
- data/README.md +35 -16
- data/Rakefile +16 -3
- data/lib/measurable.rb +6 -37
- data/lib/measurable/cosine.rb +23 -6
- data/lib/measurable/euclidean.rb +71 -35
- data/lib/measurable/haversine.rb +60 -35
- data/lib/measurable/jaccard.rb +64 -21
- data/lib/measurable/maxmin.rb +30 -9
- data/lib/measurable/tanimoto.rb +29 -8
- data/lib/measurable/version.rb +1 -1
- data/spec/cosine_spec.rb +29 -0
- data/spec/euclidean_spec.rb +61 -0
- data/spec/haversine_spec.rb +37 -0
- data/spec/jaccard_spec.rb +62 -0
- data/spec/maxmin_spec.rb +30 -0
- data/spec/spec_helper.rb +2 -0
- data/spec/tanimoto_spec.rb +30 -0
- metadata +15 -5
- data/spec/measurable_spec.rb +0 -159
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
|
|
1
1
|
---
|
2
2
|
SHA1:
|
3
|
-
metadata.gz:
|
4
|
-
data.tar.gz:
|
3
|
+
metadata.gz: 66337383a6c25685893bb39f7caf5c7f0b40bcff
|
4
|
+
data.tar.gz: 117a22bb28b1d36f14780d3a22bbad7211b279a0
|
5
5
|
SHA512:
|
6
|
-
metadata.gz:
|
7
|
-
data.tar.gz:
|
6
|
+
metadata.gz: 0d7aff51213d2f0ca31472d1d5f43b3ce99d58255b9d3d53485b0983e953866a365927734c1852c7040f7c23149455712af6357679e03ff63bf247709ac7e124
|
7
|
+
data.tar.gz: 652066925d87d7d52656f87346c856d34296e4a98662e94a670d72e9612c568dd95bf758314c524501700b21b53484eaaf059e5b5fee315129fb1b0b90a170bd
|
data/Gemfile.lock
CHANGED
data/README.md
CHANGED
@@ -1,25 +1,32 @@
|
|
1
1
|
# Measurable
|
2
2
|
|
3
|
-
|
3
|
+
A gem to test what metric is best for certain kinds of datasets in machine learning.
|
4
4
|
|
5
|
-
|
5
|
+
Besides the `Array` class, I also want to support `NVector` (from [NMatrix](http://github.com/sciruby/nmatrix)).
|
6
6
|
|
7
|
-
|
7
|
+
The distance measures will be created in Ruby first. If I see that it's really too slow, I'll write some methods in C (or Java, for JRuby).
|
8
|
+
|
9
|
+
This is a fork of the gem [Distance Measure](https://github.com/reddavis/Distance-Measures), which has a similar objective, but isn't actively maintained and doesn't support NMatrix. Thank you, [@reddavis][reddavis]. :)
|
8
10
|
|
9
11
|
## Install
|
10
12
|
|
11
13
|
`gem install measurable`
|
12
14
|
|
13
|
-
|
15
|
+
I only tested it with Ruby 2.0.0 (yes, yes, travis, I'll do it eventually). I want to support JRuby as well.
|
16
|
+
|
17
|
+
## Distance measures
|
18
|
+
|
19
|
+
I'm using the term "distance measure" without much concern for the strict mathematical definition of a metric. If the documentation for one of the methods isn't clear about whether it is a metric, please open an issue.
|
14
20
|
|
15
|
-
|
21
|
+
The following are the similarity measures supported at the moment:
|
16
22
|
|
17
23
|
- Euclidean distance
|
18
24
|
- Squared euclidean distance
|
19
25
|
- Cosine distance
|
20
|
-
- Max-min distance (["K-Means clustering using max-min distance measure"][
|
26
|
+
- Max-min distance (from ["K-Means clustering using max-min distance measure"][maxmin])
|
21
27
|
- Jaccard distance
|
22
28
|
- Tanimoto distance
|
29
|
+
- Haversine distance
|
23
30
|
|
24
31
|
These still need to be implemented:
|
25
32
|
|
@@ -36,30 +43,42 @@ These still need to be implemented:
|
|
36
43
|
|
37
44
|
## How to use
|
38
45
|
|
39
|
-
This list will be updated as I have time. I'll refactor the existing measures and add some that I'll need in a project.
|
40
|
-
|
41
46
|
The API I intend to support is something like this:
|
42
47
|
|
43
48
|
```ruby
|
44
49
|
require "measurable"
|
45
|
-
|
50
|
+
|
46
51
|
u = NVector.ones(2)
|
47
52
|
v = NVector.zeros(2)
|
48
53
|
w = [1, 0]
|
49
54
|
x = [2, 2]
|
50
55
|
|
51
|
-
|
52
|
-
Measurable
|
53
|
-
Measurable
|
54
|
-
Measurable
|
56
|
+
# Calculate the distance between two points in space.
|
57
|
+
Measurable.euclidean(u, v) # => 1.41421
|
58
|
+
Measurable.euclidean(w, v) # => 1.00000
|
59
|
+
Measurable.cosine([1, 2], [2, 3]) # => 0.99228
|
60
|
+
|
61
|
+
# Calculate the norm of a vector, i.e. its distance from the origin.
|
62
|
+
Measurable.euclidean_squared([3, 4]) # => 25
|
55
63
|
```
|
56
64
|
|
57
|
-
|
65
|
+
## Documentation
|
66
|
+
|
67
|
+
`RDoc` syntax is used to document the project. To build it locally, you'll need to install the [Fivefish generator](https://github.com/ged/rdoc-generator-fivefish) (`gem install rdoc-generator-fivefish`) and run the following command:
|
68
|
+
|
69
|
+
```bash
|
70
|
+
rdoc -f fivefish -m README.md *.md LICENSE lib/
|
71
|
+
```
|
72
|
+
|
73
|
+
I want to be able to use a Rake task to generate the documentation, thus allowing me to forget the specific command. However, there's a bug in `RDoc::Task` in which [custom generators (like Fivefish) can't be used](https://github.com/rdoc/rdoc/issues/246).
|
74
|
+
|
75
|
+
If there's something wrong with an explanation or if there's information missing, please open an issue or send a pull request.
|
58
76
|
|
59
77
|
## License
|
60
78
|
|
61
79
|
See LICENSE for details.
|
62
80
|
|
63
|
-
The original `
|
81
|
+
The original `distance_measures` gem is copyrighted by [@reddavis][reddavis].
|
64
82
|
|
65
|
-
[
|
83
|
+
[maxmin]: http://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=05156398
|
84
|
+
[reddavis]: https://github.com/reddavis
|
data/Rakefile
CHANGED
@@ -1,6 +1,7 @@
|
|
1
1
|
require 'rake'
|
2
2
|
require 'bundler/gem_tasks'
|
3
3
|
require "rspec/core/rake_task"
|
4
|
+
# require 'rdoc/task' # See below.
|
4
5
|
|
5
6
|
# Setup the necessary gems, specified in the gemspec.
|
6
7
|
require 'bundler'
|
@@ -15,10 +16,22 @@ end
|
|
15
16
|
# Run all the specs.
|
16
17
|
RSpec::Core::RakeTask.new(:spec)
|
17
18
|
|
19
|
+
# RDoc task isn't working with custom generators, as can be seen in:
|
20
|
+
# https://github.com/rdoc/rdoc/issues/246
|
21
|
+
#
|
22
|
+
# Whenever this issue is fixed, I'll resume using this task.
|
23
|
+
#
|
24
|
+
# RDoc::Task.new do |rdoc|
|
25
|
+
# rdoc.main = "README.md"
|
26
|
+
# rdoc.rdoc_files.include("README.md", "LICENSE", "lib")
|
27
|
+
# rdoc.generator = "fivefish"
|
28
|
+
# rdoc.external = true
|
29
|
+
# end
|
30
|
+
|
18
31
|
# Compile task.
|
19
32
|
# Rake::ExtensionTask.new do |ext|
|
20
|
-
# ext.name = 'measurable'
|
21
|
-
# ext.ext_dir = 'ext/measurable'
|
33
|
+
# ext.name = 'measurable'
|
34
|
+
# ext.ext_dir = 'ext/measurable'
|
22
35
|
# ext.lib_dir = 'lib/'
|
23
|
-
# ext.source_pattern = "**/*.{c, cpp, h}"
|
36
|
+
# ext.source_pattern = "**/*.{c, cpp, h}"
|
24
37
|
# end
|
data/lib/measurable.rb
CHANGED
@@ -1,47 +1,16 @@
|
|
1
|
-
require 'measurable/version
|
1
|
+
require 'measurable/version'
|
2
2
|
|
3
|
-
# Distance measures.
|
3
|
+
# Distance measures. The require order is important.
|
4
4
|
require 'measurable/euclidean'
|
5
5
|
require 'measurable/cosine'
|
6
|
-
require 'measurable/tanimoto'
|
7
6
|
require 'measurable/jaccard'
|
7
|
+
require 'measurable/tanimoto'
|
8
8
|
require 'measurable/haversine'
|
9
9
|
require 'measurable/maxmin'
|
10
10
|
|
11
11
|
module Measurable
|
12
|
-
# PI
|
13
|
-
RAD_PER_DEG =
|
14
|
-
class << self
|
15
|
-
def binary_union(u, v)
|
16
|
-
unions = []
|
17
|
-
u.each_with_index do |n, index|
|
18
|
-
if n == 1 || v[index] == 1
|
19
|
-
unions << 1
|
20
|
-
else
|
21
|
-
unions << 0
|
22
|
-
end
|
23
|
-
end
|
24
|
-
|
25
|
-
unions
|
26
|
-
end
|
27
|
-
|
28
|
-
def binary_intersection(u, v)
|
29
|
-
intersects = []
|
30
|
-
u.each_with_index do |n, index|
|
31
|
-
if n == 1 && v[index] == 1
|
32
|
-
intersects << 1
|
33
|
-
else
|
34
|
-
intersects << 0
|
35
|
-
end
|
36
|
-
end
|
37
|
-
|
38
|
-
intersects
|
39
|
-
end
|
12
|
+
# PI / 180 degrees.
|
13
|
+
RAD_PER_DEG = Math::PI / 180
|
40
14
|
|
41
|
-
|
42
|
-
# handle NaN"s is set to false
|
43
|
-
def handle_nan(result)
|
44
|
-
result.nan? ? 0.0 : result
|
45
|
-
end
|
46
|
-
end
|
15
|
+
extend self # expose all instance methods as singleton methods.
|
47
16
|
end
|
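A minimal sketch of what the new `extend self` line does: instance methods defined in the module also become callable on the module itself, which is what lets callers write `Measurable.euclidean(...)` without a `class << self` block. The module name `ExtendSelfSketch` and the helper `to_radians` are hypothetical and not part of the gem; only `RAD_PER_DEG` mirrors the constant in this diff.

```ruby
# Hypothetical module, used only to illustrate `extend self`.
module ExtendSelfSketch
  # Degrees-to-radians factor, same definition as RAD_PER_DEG in the diff.
  RAD_PER_DEG = Math::PI / 180

  # An ordinary instance method (hypothetical helper, not in the gem).
  def to_radians(degrees)
    degrees * RAD_PER_DEG
  end

  # Adds the module to its own singleton class, so to_radians can be
  # called as ExtendSelfSketch.to_radians.
  extend self
end

puts ExtendSelfSketch.to_radians(180) # approximately Math::PI
```

A likely reason the require order matters is that `alias` needs its target method to exist at the moment the alias is evaluated, so a file that aliases a method from another file has to be loaded after it.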
data/lib/measurable/cosine.rb
CHANGED
@@ -1,10 +1,27 @@
|
|
1
1
|
module Measurable
|
2
|
-
class << self
|
3
|
-
def cosine(u, v)
|
4
|
-
dot_product = dot(u, v)
|
5
|
-
normalization = self.euclidean_normalize * other.euclidean_normalize
|
6
2
|
|
7
|
-
|
8
|
-
|
3
|
+
# call-seq:
|
4
|
+
# cosine(u, v) -> Float
|
5
|
+
#
|
6
|
+
# Calculate the similarity between the orientation of two vectors.
|
7
|
+
#
|
8
|
+
# See: http://en.wikipedia.org/wiki/Cosine_similarity
|
9
|
+
#
|
10
|
+
# * *Arguments* :
|
11
|
+
# - +u+ -> An array of Numeric objects.
|
12
|
+
# - +v+ -> An array of Numeric objects.
|
13
|
+
# * *Returns* :
|
14
|
+
# - The normalized dot product of +u+ and +v+, that is, the cosine of the
|
15
|
+
# angle between them in the n-dimensional space.
|
16
|
+
# * *Raises* :
|
17
|
+
# - +ArgumentError+ -> The sizes of +u+ and +v+ don't match.
|
18
|
+
#
|
19
|
+
def cosine(u, v)
|
20
|
+
# TODO: Change this to a more specific, custom-made exception.
|
21
|
+
raise ArgumentError if u.size != v.size
|
22
|
+
|
23
|
+
dot_product = u.zip(v).reduce(0.0) { |acc, ary| acc += ary[0] * ary[1] }
|
24
|
+
|
25
|
+
dot_product / (euclidean(u) * euclidean(v))
|
9
26
|
end
|
10
27
|
end
|
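A standalone sketch of the cosine computation in this file, assuming plain Ruby arrays as input. `CosineSketch` and its `norm` helper are hypothetical names chosen for illustration; the gem itself calls `euclidean(u)` for the norm.

```ruby
# Hypothetical module, not part of the gem.
module CosineSketch
  extend self

  # Euclidean norm of a vector (stands in for the gem's euclidean(u)).
  def norm(u)
    Math.sqrt(u.reduce(0.0) { |acc, x| acc + x * x })
  end

  # Dot product divided by the product of the norms.
  def cosine(u, v)
    raise ArgumentError, "size mismatch" if u.size != v.size

    dot = u.zip(v).reduce(0.0) { |acc, (a, b)| acc + a * b }
    dot / (norm(u) * norm(v))
  end
end

similarity = CosineSketch.cosine([1, 2], [2, 3])
puts similarity.round(5)        # 0.99228
puts((1 - similarity).round(5)) # 0.00772
```

The value returned is a similarity (1.0 for vectors pointing the same way), so a caller who wants a distance would typically use `1 - similarity`.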
data/lib/measurable/euclidean.rb
CHANGED
@@ -1,40 +1,76 @@
|
|
1
1
|
module Measurable
|
2
|
-
|
3
|
-
|
4
|
-
|
5
|
-
|
6
|
-
|
7
|
-
|
8
|
-
|
9
|
-
|
10
|
-
|
11
|
-
|
12
|
-
|
13
|
-
|
14
|
-
|
15
|
-
|
16
|
-
|
17
|
-
|
18
|
-
|
19
|
-
|
20
|
-
|
2
|
+
|
3
|
+
# call-seq:
|
4
|
+
# euclidean(u) -> Float
|
5
|
+
# euclidean(u, v) -> Float
|
6
|
+
#
|
7
|
+
# Calculate the ordinary distance between arrays +u+ and +v+.
|
8
|
+
#
|
9
|
+
# If +v+ isn't given, calculate the Euclidean norm of +u+.
|
10
|
+
#
|
11
|
+
# See: http://en.wikipedia.org/wiki/Euclidean_distance#N_dimensions
|
12
|
+
#
|
13
|
+
# * *Arguments* :
|
14
|
+
# - +u+ -> An array of Numeric objects.
|
15
|
+
# - +v+ -> (Optional) An array of Numeric objects.
|
16
|
+
# * *Returns* :
|
17
|
+
# - The euclidean norm of +u+ or the euclidean distance between +u+ and
|
18
|
+
# +v+.
|
19
|
+
# * *Raises* :
|
20
|
+
# - +ArgumentError+ -> The sizes of +u+ and +v+ don't match.
|
21
|
+
#
|
22
|
+
def euclidean(u, v = nil)
|
23
|
+
# If the second argument is nil, the method should return the norm of
|
24
|
+
# vector u. For this, we need the distance between u and the origin.
|
25
|
+
if v.nil?
|
26
|
+
v = Array.new(u.size, 0)
|
21
27
|
end
|
22
|
-
|
23
|
-
|
24
|
-
|
25
|
-
|
26
|
-
|
27
|
-
|
28
|
-
|
29
|
-
|
30
|
-
|
31
|
-
|
32
|
-
|
33
|
-
|
34
|
-
|
35
|
-
|
36
|
-
|
37
|
-
|
28
|
+
|
29
|
+
# TODO: Change this to a more specific, custom-made exception.
|
30
|
+
raise ArgumentError if u.size != v.size
|
31
|
+
|
32
|
+
sum = u.zip(v).reduce(0.0) do |acc, ary|
|
33
|
+
acc += (ary[0] - ary[-1]) ** 2
|
34
|
+
end
|
35
|
+
|
36
|
+
Math.sqrt(sum)
|
37
|
+
end
|
38
|
+
|
39
|
+
# call-seq:
|
40
|
+
# euclidean_squared(u) -> Float
|
41
|
+
# euclidean_squared(u, v) -> Float
|
42
|
+
#
|
43
|
+
# Calculate the same value as euclidean(u, v), but don't take the square root
|
44
|
+
# of it.
|
45
|
+
#
|
46
|
+
# This isn't a metric in the strict sense, i.e. it doesn't respect the
|
47
|
+
# triangle inequality. However, the squared Euclidean distance is very useful
|
48
|
+
# whenever only the relative values of distances are important, for example
|
49
|
+
# in optimization problems.
|
50
|
+
#
|
51
|
+
# See: http://en.wikipedia.org/wiki/Euclidean_distance#Squared_Euclidean_distance
|
52
|
+
#
|
53
|
+
# * *Arguments* :
|
54
|
+
# - +u+ -> An array of Numeric objects.
|
55
|
+
# - +v+ -> (Optional) An array of Numeric objects.
|
56
|
+
# * *Returns* :
|
57
|
+
# - The squared value of the euclidean norm of +u+ or of the euclidean
|
58
|
+
# distance between +u+ and +v+.
|
59
|
+
# * *Raises* :
|
60
|
+
# - +ArgumentError+ -> The sizes of +u+ and +v+ don't match.
|
61
|
+
#
|
62
|
+
def euclidean_squared(u, v = nil)
|
63
|
+
# If the second argument is nil, the method should return the norm of
|
64
|
+
# vector u. For this, we need the distance between u and the origin.
|
65
|
+
if v.nil?
|
66
|
+
v = Array.new(u.size, 0)
|
67
|
+
end
|
68
|
+
|
69
|
+
# TODO: Change this to a more specific, custom-made exception.
|
70
|
+
raise ArgumentError if u.size != v.size
|
71
|
+
|
72
|
+
u.zip(v).reduce(0.0) do |acc, ary|
|
73
|
+
acc += (ary[0] - ary[-1]) ** 2
|
38
74
|
end
|
39
75
|
end
|
40
76
|
end
|
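A condensed sketch of the two Euclidean methods, under the assumption that inputs are plain arrays of numbers. `EuclideanSketch` is a hypothetical module; the logic follows the diff (a missing second argument means "distance from the origin"), but the squared version is reused here to avoid duplicating the loop, which the gem does not do.

```ruby
# Hypothetical module, not part of the gem.
module EuclideanSketch
  extend self

  # Sum of squared coordinate differences; v defaults to the origin.
  def euclidean_squared(u, v = nil)
    v ||= Array.new(u.size, 0)
    raise ArgumentError, "size mismatch" if u.size != v.size

    u.zip(v).reduce(0.0) { |acc, (a, b)| acc + (a - b)**2 }
  end

  # Square root of the squared distance.
  def euclidean(u, v = nil)
    Math.sqrt(euclidean_squared(u, v))
  end
end

puts EuclideanSketch.euclidean([3, 4])                  # 5.0
puts EuclideanSketch.euclidean_squared([3, 4])          # 25.0
puts EuclideanSketch.euclidean([1, 3, 16], [1, 4, 16])  # 1.0

# The squared distance breaks the triangle inequality: going from 0 to 2
# directly "costs" 4.0, while going through 1 costs 1.0 + 1.0.
direct = EuclideanSketch.euclidean_squared([0], [2])
detour = EuclideanSketch.euclidean_squared([0], [1]) +
         EuclideanSketch.euclidean_squared([1], [2])
puts direct > detour # true
```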
data/lib/measurable/haversine.rb
CHANGED
@@ -1,46 +1,71 @@
|
|
1
|
-
# Notes:
|
2
|
-
#
|
3
|
-
# translated into Ruby based on information contained in:
|
4
|
-
# http://mathforum.org/library/drmath/view/51879.html
|
5
|
-
# Dr. Rick and Dr. Peterson - 4/20/99
|
6
|
-
#
|
7
|
-
# http://www.movable-type.co.uk/scripts/latlong.html
|
8
|
-
# http://en.wikipedia.org/wiki/Haversine_formula
|
9
|
-
#
|
10
|
-
# This formula can compute accurate distances between two points given latitude
|
11
|
-
# and longitude, even for short distances.
|
12
|
-
|
13
1
|
module Measurable
|
14
2
|
|
15
|
-
|
16
|
-
|
17
|
-
|
18
|
-
#
|
19
|
-
|
20
|
-
|
21
|
-
|
22
|
-
|
23
|
-
|
3
|
+
# Earth radius in miles.
|
4
|
+
EARTH_RADIUS_IN_MILES = 3956
|
5
|
+
|
6
|
+
# Earth radius in kilometers. Some algorithms use 6367.
|
7
|
+
EARTH_RADIUS_IN_KILOMETERS = 6371
|
8
|
+
|
9
|
+
# The great circle distance returned will be in whatever units R is in.
|
10
|
+
# Provides
|
11
|
+
EARTH_RADIUS = {
|
12
|
+
:miles => EARTH_RADIUS_IN_MILES,
|
13
|
+
:km => EARTH_RADIUS_IN_KILOMETERS,
|
14
|
+
:feet => EARTH_RADIUS_IN_MILES * 5280,
|
15
|
+
:meters => EARTH_RADIUS_IN_KILOMETERS * 1000
|
24
16
|
}
|
25
17
|
|
26
|
-
|
27
|
-
|
28
|
-
|
29
|
-
|
18
|
+
# call-seq:
|
19
|
+
# haversine(u, v) -> Float
|
20
|
+
#
|
21
|
+
# Compute accurate distances between two points given their latitudes and
|
22
|
+
# longitudes, even for short distances. This isn't a distance measure in the
|
23
|
+
# same sense as the other methods in +Measurable+.
|
24
|
+
#
|
25
|
+
# The distance returned is the great circle (or orthodromic) distance between
|
26
|
+
# +u+ and +v+, which is the shortest distance between them on the surface of
|
27
|
+
# a sphere. Thus, this implementation considers the Earth to be a sphere.
|
28
|
+
#
|
29
|
+
# Remember that the input vectors are of the form [latitude, longitude] in
|
30
|
+
# degrees, so if you have the coordinates [23 32' S, 46 37' W] (from São
|
31
|
+
# Paulo), the corresponding vector is [-23.53333, -46.61667].
|
32
|
+
#
|
33
|
+
# References:
|
34
|
+
# - http://www.movable-type.co.uk/scripts/latlong.html
|
35
|
+
# - http://en.wikipedia.org/wiki/Haversine_formula
|
36
|
+
# - http://en.wikipedia.org/wiki/Great-circle_distance
|
37
|
+
#
|
38
|
+
# * *Arguments* :
|
39
|
+
# - +u+ -> An array of Numeric objects.
|
40
|
+
# - +v+ -> An array of Numeric objects.
|
41
|
+
# - +unit+ -> (Optional) A Symbol representing the unit of measure. Available
|
42
|
+
# options are +:miles+, +:feet+, +:km+ and +:meters+.
|
43
|
+
# * *Returns* :
|
44
|
+
# - The great circle distance between +u+ and +v+.
|
45
|
+
# * *Raises* :
|
46
|
+
# - +ArgumentError+ -> The size of +u+ and +v+ must be 2.
|
47
|
+
# - +ArgumentError+ -> +unit+ must be a Symbol.
|
48
|
+
#
|
49
|
+
def haversine(u, v, unit = :meters)
|
50
|
+
# TODO: Create better exceptions.
|
51
|
+
raise ArgumentError if u.size != 2 || v.size != 2
|
52
|
+
raise ArgumentError if unit.class != Symbol
|
53
|
+
|
54
|
+
dlat = u[0] - v[0]
|
55
|
+
dlon = u[1] - v[1]
|
30
56
|
|
31
|
-
|
32
|
-
|
57
|
+
dlon_rad = dlon * RAD_PER_DEG
|
58
|
+
dlat_rad = dlat * RAD_PER_DEG
|
33
59
|
|
34
|
-
|
35
|
-
|
60
|
+
lat1_rad = v[0] * RAD_PER_DEG
|
61
|
+
lon1_rad = v[1] * RAD_PER_DEG
|
36
62
|
|
37
|
-
|
38
|
-
|
63
|
+
lat2_rad = u[0] * RAD_PER_DEG
|
64
|
+
lon2_rad = u[1] * RAD_PER_DEG
|
39
65
|
|
40
|
-
|
41
|
-
|
66
|
+
a = (Math.sin(dlat_rad / 2)) ** 2 + Math.cos(lat1_rad) * Math.cos(lat2_rad) * (Math.sin(dlon_rad / 2)) ** 2
|
67
|
+
c = 2 * Math.atan2(Math.sqrt(a), Math.sqrt(1 - a))
|
42
68
|
|
43
|
-
|
44
|
-
end
|
69
|
+
EARTH_RADIUS[unit] * c
|
45
70
|
end
|
46
71
|
end
|
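A self-contained sketch of the haversine formula used in this file, restricted to kilometers for brevity. `HaversineSketch`, `haversine_km` and `EARTH_RADIUS_KM` are hypothetical names; the radius value 6371 and the Tokyo and São Paulo coordinates are taken from the diff and its spec.

```ruby
# Hypothetical module, not part of the gem.
module HaversineSketch
  extend self

  RAD_PER_DEG = Math::PI / 180
  EARTH_RADIUS_KM = 6371 # same value as EARTH_RADIUS_IN_KILOMETERS in the diff

  # Great-circle distance in km between two [latitude, longitude] pairs
  # given in degrees, treating the Earth as a sphere.
  def haversine_km(u, v)
    raise ArgumentError, "expected [lat, lon] pairs" if u.size != 2 || v.size != 2

    lat1, lon1 = u.map { |deg| deg * RAD_PER_DEG }
    lat2, lon2 = v.map { |deg| deg * RAD_PER_DEG }

    a = Math.sin((lat2 - lat1) / 2)**2 +
        Math.cos(lat1) * Math.cos(lat2) * Math.sin((lon2 - lon1) / 2)**2

    2 * EARTH_RADIUS_KM * Math.atan2(Math.sqrt(a), Math.sqrt(1 - a))
  end
end

tokyo     = [35.66667, 139.75]
sao_paulo = [-23.53333, -46.61667]

# The spec in this diff expects a value within 1 km of 18533.
puts HaversineSketch.haversine_km(tokyo, sao_paulo).round
```

Because the Earth is modeled as a perfect sphere, results should be read as approximations rather than survey-grade distances.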
data/lib/measurable/jaccard.rb
CHANGED
@@ -1,26 +1,69 @@
|
|
1
|
-
# http://en.wikipedia.org/wiki/Jaccard_coefficient
|
2
1
|
module Measurable
|
3
|
-
|
4
|
-
|
5
|
-
|
2
|
+
|
3
|
+
# call-seq:
|
4
|
+
# jaccard_index(u, v) -> Float
|
5
|
+
#
|
6
|
+
# Give the similarity between two binary vectors +u+ and +v+. Calculated as:
|
7
|
+
# jaccard_index = |intersection| / |union|
|
8
|
+
#
|
9
|
+
# In which intersection and union refer to +u+ and +v+ and |x| is the
|
10
|
+
# cardinality of set x.
|
11
|
+
#
|
12
|
+
# For example:
|
13
|
+
# jaccard_index([1, 0, 1], [1, 1, 1]) == 0.666...
|
14
|
+
#
|
15
|
+
# Because |intersection| = |(1, 0, 1)| = 2 and |union| = |(1, 1, 1)| = 3.
|
16
|
+
#
|
17
|
+
# See: http://en.wikipedia.org/wiki/Jaccard_coefficient
|
18
|
+
#
|
19
|
+
# * *Arguments* :
|
20
|
+
# - +u+ -> Array of 1s and 0s.
|
21
|
+
# - +v+ -> Array of 1s and 0s.
|
22
|
+
# * *Returns* :
|
23
|
+
# - Float value representing the Jaccard similarity coefficient between
|
24
|
+
# +u+ and +v+.
|
25
|
+
# * *Raises* :
|
26
|
+
# - +ArgumentError+ -> The size of the input arrays doesn't match.
|
27
|
+
#
|
28
|
+
def jaccard_index(u, v)
|
29
|
+
# TODO: Change this to a more specific, custom-made exception.
|
30
|
+
raise ArgumentError if u.size != v.size
|
31
|
+
|
32
|
+
intersection = u.zip(v).reduce(0) do |acc, elem|
|
33
|
+
# Both u and v must have this element.
|
34
|
+
elem[0] + elem[1] == 2 ? (acc + 1) : acc
|
6
35
|
end
|
7
|
-
|
8
|
-
|
9
|
-
|
10
|
-
|
11
|
-
|
12
|
-
intersection / union
|
13
|
-
end
|
14
|
-
|
15
|
-
def binary_jaccard(u, v)
|
16
|
-
1 - binary_jaccard_index(u, v)
|
17
|
-
end
|
18
|
-
|
19
|
-
def binary_jaccard_index(u, v)
|
20
|
-
intersection = binary_intersection(u, v).delete_if {|x| x == 0}.size.to_f
|
21
|
-
union = binary_union(u, v).delete_if {|x| x == 0}.size.to_f
|
22
|
-
|
23
|
-
intersection / union
|
36
|
+
|
37
|
+
union = u.zip(v).reduce(0) do |acc, elem|
|
38
|
+
# One of u and v must have this element.
|
39
|
+
elem[0] + elem[1] >= 1 ? (acc + 1) : acc
|
24
40
|
end
|
41
|
+
|
42
|
+
intersection.to_f / union
|
43
|
+
end
|
44
|
+
|
45
|
+
# call-seq:
|
46
|
+
# jaccard(u, v) -> Float
|
47
|
+
#
|
48
|
+
# The jaccard distance is a measure of dissimilarity between two sets. It is
|
49
|
+
# calculated as:
|
50
|
+
# jaccard_distance = 1 - jaccard_index
|
51
|
+
#
|
52
|
+
# This is a proper metric, i.e. the following conditions hold:
|
53
|
+
# - Symmetry: jaccard(u, v) == jaccard(v, u)
|
54
|
+
# - Non-negative: jaccard(u, v) >= 0
|
55
|
+
# - Coincidence axiom: jaccard(u, v) == 0 if u == v
|
56
|
+
# - Triangular inequality: jaccard(u, v) <= jaccard(u, w) + jaccard(w, v)
|
57
|
+
#
|
58
|
+
# * *Arguments* :
|
59
|
+
# - +u+ -> Array of 1s and 0s.
|
60
|
+
# - +v+ -> Array of 1s and 0s.
|
61
|
+
# * *Returns* :
|
62
|
+
# - Float value representing the dissimilarity between +u+ and +v+.
|
63
|
+
# * *Raises* :
|
64
|
+
# - +ArgumentError+ -> The size of the input arrays doesn't match.
|
65
|
+
#
|
66
|
+
def jaccard(u, v)
|
67
|
+
1 - jaccard_index(u, v)
|
25
68
|
end
|
26
69
|
end
|
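A compact sketch of the Jaccard index and distance for binary vectors, assuming the inputs contain only 0s and 1s. `JaccardSketch` is a hypothetical module; it uses `count` instead of the `reduce` calls in the diff, which is a stylistic substitution and not the gem's code.

```ruby
# Hypothetical module, not part of the gem.
module JaccardSketch
  extend self

  # |intersection| / |union| for two binary vectors of equal size.
  def jaccard_index(u, v)
    raise ArgumentError, "size mismatch" if u.size != v.size

    pairs = u.zip(v)
    intersection = pairs.count { |a, b| a == 1 && b == 1 } # both have the element
    union        = pairs.count { |a, b| a == 1 || b == 1 } # at least one has it

    intersection.to_f / union
  end

  # Dissimilarity: 1 - jaccard_index.
  def jaccard(u, v)
    1 - jaccard_index(u, v)
  end
end

puts JaccardSketch.jaccard_index([1, 0, 1], [1, 1, 1]).round(4) # 0.6667
puts JaccardSketch.jaccard([1, 0, 1], [1, 1, 1]).round(4)       # 0.3333
```

One edge case worth knowing about: when both vectors are all zeros the union is empty, and the float division yields `NaN` rather than raising.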
data/lib/measurable/maxmin.rb
CHANGED
@@ -1,13 +1,34 @@
|
|
1
1
|
module Measurable
|
2
|
-
|
3
|
-
|
4
|
-
|
5
|
-
|
6
|
-
|
7
|
-
|
8
|
-
|
9
|
-
|
10
|
-
|
2
|
+
|
3
|
+
# call-seq:
|
4
|
+
# maxmin(u, v) -> Float
|
5
|
+
#
|
6
|
+
# The "Max-min distance" is used to measure similarity between two vectors.
|
7
|
+
#
|
8
|
+
# When used in k-means clustering, this similarity measure can give better
|
9
|
+
# results in some datasets, as pointed out in the paper "K-means clustering
|
10
|
+
# using Max-min distance measure" --- Visalakshi, N. K.; Suguna, J.
|
11
|
+
#
|
12
|
+
# See: http://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=05156398
|
13
|
+
#
|
14
|
+
# * *Arguments* :
|
15
|
+
# - +u+ -> An array of Numeric objects.
|
16
|
+
# - +v+ -> An array of Numeric objects.
|
17
|
+
# * *Returns* :
|
18
|
+
# - Similarity between +u+ and +v+.
|
19
|
+
# * *Raises* :
|
20
|
+
# - +ArgumentError+ -> The sizes of +u+ and +v+ don't match.
|
21
|
+
#
|
22
|
+
def maxmin(u, v)
|
23
|
+
# TODO: Change this to a more specific, custom-made exception.
|
24
|
+
raise ArgumentError if u.size != v.size
|
25
|
+
|
26
|
+
sum_min, sum_max = u.zip(v).reduce([0.0, 0.0]) do |acc, attributes|
|
27
|
+
acc[0] += attributes.min
|
28
|
+
acc[1] += attributes.max
|
29
|
+
acc
|
11
30
|
end
|
31
|
+
|
32
|
+
sum_min / sum_max
|
12
33
|
end
|
13
34
|
end
|
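A short sketch of the max-min computation, assuming non-negative numeric attributes. `MaxminSketch` is a hypothetical module; it computes the two sums in separate passes for readability, whereas the diff accumulates both in a single `reduce`.

```ruby
# Hypothetical module, not part of the gem.
module MaxminSketch
  extend self

  # Ratio between the sum of pairwise minima and the sum of pairwise maxima.
  def maxmin(u, v)
    raise ArgumentError, "size mismatch" if u.size != v.size

    pairs   = u.zip(v)
    sum_min = pairs.reduce(0.0) { |acc, pair| acc + pair.min }
    sum_max = pairs.reduce(0.0) { |acc, pair| acc + pair.max }

    sum_min / sum_max
  end
end

puts MaxminSketch.maxmin([1, 3, 16], [1, 4, 16]).round(4) # 0.9524 (20 / 21)
puts MaxminSketch.maxmin([1, 3, 16], [1, 3, 16])          # 1.0
```

Identical vectors give 1.0, so the value behaves as a similarity rather than as a distance, which matches the "Returns" note in the documentation comment.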
data/lib/measurable/tanimoto.rb
CHANGED
@@ -1,11 +1,32 @@
|
|
1
|
-
# http://en.wikipedia.org/wiki/Jaccard_index#Tanimoto_coefficient_.28extended_Jaccard_coefficient.29
|
2
1
|
module Measurable
|
3
|
-
|
4
|
-
|
5
|
-
|
6
|
-
|
7
|
-
|
8
|
-
|
9
|
-
|
2
|
+
|
3
|
+
# Tanimoto similarity is the same as Jaccard similarity.
|
4
|
+
alias :tanimoto_similarity :jaccard_index
|
5
|
+
|
6
|
+
# call-seq:
|
7
|
+
# tanimoto(u, v) -> Float
|
8
|
+
#
|
9
|
+
# Tanimoto distance is a coefficient explicitly chosen so as to allow for
|
10
|
+
# two dissimilar specimens to be similar to a third one. This breaks the
|
11
|
+
# triangle inequality, thus this isn't a metric.
|
12
|
+
#
|
13
|
+
# More information and references on this are needed. It's left here mostly
|
14
|
+
# as a piece of curiosity.
|
15
|
+
#
|
16
|
+
# See: http://en.wikipedia.org/wiki/Jaccard_index#Tanimoto.27s_Definitions_of_Similarity_and_Distance
|
17
|
+
#
|
18
|
+
# * *Arguments* :
|
19
|
+
# - +u+ -> An array of Numeric objects.
|
20
|
+
# - +v+ -> An array of Numeric objects.
|
21
|
+
# * *Returns* :
|
22
|
+
# - A measure of the similarity between +u+ and +v+.
|
23
|
+
# * *Raises* :
|
24
|
+
# - +ArgumentError+ -> The sizes of +u+ and +v+ don't match.
|
25
|
+
#
|
26
|
+
def tanimoto(u, v)
|
27
|
+
# TODO: Change this to a more specific, custom-made exception.
|
28
|
+
raise ArgumentError if u.size != v.size
|
29
|
+
|
30
|
+
-Math.log2(jaccard_index(u, v))
|
10
31
|
end
|
11
32
|
end
|
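A sketch that illustrates the claim in the documentation comment that this Tanimoto distance breaks the triangle inequality. `TanimotoSketch` is a hypothetical module that carries its own copy of a Jaccard index so the example runs standalone; the three vectors are made up for the demonstration.

```ruby
# Hypothetical module, not part of the gem.
module TanimotoSketch
  extend self

  # |intersection| / |union| for two binary vectors of equal size.
  def jaccard_index(u, v)
    raise ArgumentError, "size mismatch" if u.size != v.size

    pairs = u.zip(v)
    pairs.count { |a, b| a == 1 && b == 1 }.to_f /
      pairs.count { |a, b| a == 1 || b == 1 }
  end

  # Negative base-2 logarithm of the Jaccard index, as in the diff.
  def tanimoto(u, v)
    -Math.log2(jaccard_index(u, v))
  end
end

u = [1, 0]
v = [0, 1]
w = [1, 1]

d_uv = TanimotoSketch.tanimoto(u, v) # no overlap: index 0.0, distance Infinity
d_uw = TanimotoSketch.tanimoto(u, w) # index 0.5, distance 1.0
d_wv = TanimotoSketch.tanimoto(w, v) # index 0.5, distance 1.0

puts d_uv > d_uw + d_wv # true: u and v are each close to w, yet far apart
```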
data/lib/measurable/version.rb
CHANGED
data/spec/cosine_spec.rb
ADDED
@@ -0,0 +1,29 @@
|
|
1
|
+
describe "Cosine distance" do
|
2
|
+
|
3
|
+
before :all do
|
4
|
+
@u = [1, 2]
|
5
|
+
@v = [2, 3]
|
6
|
+
@w = [4, 5]
|
7
|
+
end
|
8
|
+
|
9
|
+
it "accepts two arguments" do
|
10
|
+
expect { Measurable.cosine(@u, @v) }.to_not raise_error
|
11
|
+
expect { Measurable.cosine(@u, @v, @w) }.to raise_error(ArgumentError)
|
12
|
+
end
|
13
|
+
|
14
|
+
it "should be symmetric" do
|
15
|
+
x = Measurable.cosine(@u, @v)
|
16
|
+
y = Measurable.cosine(@v, @u)
|
17
|
+
|
18
|
+
x.should be_within(TOLERANCE).of(y)
|
19
|
+
end
|
20
|
+
|
21
|
+
it "should return the correct value" do
|
22
|
+
x = Measurable.cosine(@u, @v)
|
23
|
+
x.should be_within(TOLERANCE).of(0.992277877)
|
24
|
+
end
|
25
|
+
|
26
|
+
it "shouldn't work with vectors of different length" do
|
27
|
+
expect { Measurable.cosine(@u, [1, 3, 5, 7]) }.to raise_error
|
28
|
+
end
|
29
|
+
end
|
data/spec/euclidean_spec.rb
ADDED
@@ -0,0 +1,61 @@
|
|
1
|
+
describe "Euclidean" do
|
2
|
+
|
3
|
+
before :all do
|
4
|
+
@u = [1, 3, 16]
|
5
|
+
@v = [1, 4, 16]
|
6
|
+
@w = [4, 5, 6]
|
7
|
+
end
|
8
|
+
|
9
|
+
context "Distance" do
|
10
|
+
it "accepts two arguments" do
|
11
|
+
expect { Measurable.euclidean(@u, @v) }.to_not raise_error
|
12
|
+
expect { Measurable.euclidean(@u, @v, @w) }.to raise_error(ArgumentError)
|
13
|
+
end
|
14
|
+
|
15
|
+
it "accepts one argument and returns the vector's norm" do
|
16
|
+
# Remember that 3^2 + 4^2 = 5^2.
|
17
|
+
Measurable.euclidean([3, 4]).should == 5
|
18
|
+
end
|
19
|
+
|
20
|
+
it "should be symmetric" do
|
21
|
+
Measurable.euclidean(@u, @v).should == Measurable.euclidean(@v, @u)
|
22
|
+
end
|
23
|
+
|
24
|
+
it "should return the correct value" do
|
25
|
+
Measurable.euclidean(@u, @u).should == 0
|
26
|
+
Measurable.euclidean(@u, @v).should == 1
|
27
|
+
end
|
28
|
+
|
29
|
+
it "shouldn't work with vectors of different length" do
|
30
|
+
expect { Measurable.euclidean(@u, [2, 2, 2, 2]) }.to raise_error
|
31
|
+
end
|
32
|
+
end
|
33
|
+
|
34
|
+
context "Squared Distance" do
|
35
|
+
it "accepts two arguments" do
|
36
|
+
expect { Measurable.euclidean_squared(@u, @v) }.to_not raise_error
|
37
|
+
expect { Measurable.euclidean_squared(@u, @v, @w) }.to raise_error(ArgumentError)
|
38
|
+
end
|
39
|
+
|
40
|
+
it "accepts one argument and returns the vector's norm" do
|
41
|
+
# Remember that 3^2 + 4^2 = 5^2.
|
42
|
+
Measurable.euclidean_squared([3, 4]).should == 25
|
43
|
+
end
|
44
|
+
|
45
|
+
it "should be symmetric" do
|
46
|
+
x = Measurable.euclidean_squared(@u, @v)
|
47
|
+
y = Measurable.euclidean_squared(@v, @u)
|
48
|
+
|
49
|
+
x.should == y
|
50
|
+
end
|
51
|
+
|
52
|
+
it "should return the correct value" do
|
53
|
+
Measurable.euclidean_squared(@u, @u).should == 0
|
54
|
+
Measurable.euclidean_squared(@u, @v).should == 1
|
55
|
+
end
|
56
|
+
|
57
|
+
it "shouldn't work with vectors of different length" do
|
58
|
+
expect { Measurable.euclidean_squared(@u, [2, 2, 2, 2]) }.to raise_error
|
59
|
+
end
|
60
|
+
end
|
61
|
+
end
|
data/spec/haversine_spec.rb
ADDED
@@ -0,0 +1,37 @@
|
|
1
|
+
describe "Haversine distance" do
|
2
|
+
|
3
|
+
before :all do
|
4
|
+
# We have very big errors in this formula, due to:
|
5
|
+
# - The Earth is considered a sphere.
|
6
|
+
# - Earth's radius is considered constant (same as above).
|
7
|
+
#
|
8
|
+
# Given these conditions, I'll just assume the error to be less than 1.
|
9
|
+
# TODO: Calculate better error estimates.
|
10
|
+
@haversine_tolerance = 1
|
11
|
+
|
12
|
+
@u = [ 35.66667, 139.75] # Tokyo: 35 40' N, 139 45' E.
|
13
|
+
@v = [-23.53333, -46.61667] # São Paulo: 23 32' S, 46 37' W.
|
14
|
+
end
|
15
|
+
|
16
|
+
it "accepts two arguments" do
|
17
|
+
expect { Measurable.haversine(@u, @v) }.to_not raise_error
|
18
|
+
expect { Measurable.haversine(@u, @v, [-24.5, 40.23]) }.to raise_error(ArgumentError)
|
19
|
+
end
|
20
|
+
|
21
|
+
it "should be symmetric" do
|
22
|
+
x = Measurable.haversine(@u, @v)
|
23
|
+
y = Measurable.haversine(@v, @u)
|
24
|
+
|
25
|
+
x.should be_within(TOLERANCE).of(y)
|
26
|
+
end
|
27
|
+
|
28
|
+
it "should return the correct value" do
|
29
|
+
x = Measurable.haversine(@u, @v, :km)
|
30
|
+
|
31
|
+
x.should be_within(@haversine_tolerance).of(18533)
|
32
|
+
end
|
33
|
+
|
34
|
+
it "should only work with [lat, long] vectors" do
|
35
|
+
expect { Measurable.haversine([2, 4], [1, 3, 5, 7]) }.to raise_error
|
36
|
+
end
|
37
|
+
end
|
data/spec/jaccard_spec.rb
ADDED
@@ -0,0 +1,62 @@
|
|
1
|
+
describe "Jaccard" do
|
2
|
+
|
3
|
+
context "Index" do
|
4
|
+
before :all do
|
5
|
+
@u = [1, 0, 1]
|
6
|
+
@v = [1, 1, 1]
|
7
|
+
@w = [0, 1, 0]
|
8
|
+
end
|
9
|
+
|
10
|
+
it "accepts two arguments" do
|
11
|
+
expect { Measurable.jaccard_index(@u, @v) }.to_not raise_error
|
12
|
+
expect { Measurable.jaccard_index(@u, @v, @w) }.to raise_error(ArgumentError)
|
13
|
+
end
|
14
|
+
|
15
|
+
it "should be symmetric" do
|
16
|
+
x = Measurable.jaccard_index(@u, @v)
|
17
|
+
y = Measurable.jaccard_index(@v, @u)
|
18
|
+
|
19
|
+
x.should be_within(TOLERANCE).of(y)
|
20
|
+
end
|
21
|
+
|
22
|
+
it "should return the correct value" do
|
23
|
+
x = Measurable.jaccard_index(@u, @v)
|
24
|
+
|
25
|
+
x.should be_within(TOLERANCE).of(2.0 / 3.0)
|
26
|
+
end
|
27
|
+
|
28
|
+
it "shouldn't work with vectors of different length" do
|
29
|
+
expect { Measurable.jaccard_index(@u, [1, 2, 3, 4]) }.to raise_error
|
30
|
+
end
|
31
|
+
end
|
32
|
+
|
33
|
+
context "Distance" do
|
34
|
+
before :all do
|
35
|
+
@u = [1, 0, 1]
|
36
|
+
@v = [1, 1, 1]
|
37
|
+
@w = [0, 1, 0]
|
38
|
+
end
|
39
|
+
|
40
|
+
it "accepts two arguments" do
|
41
|
+
expect { Measurable.jaccard(@u, @v) }.to_not raise_error
|
42
|
+
expect { Measurable.jaccard(@u, @v, @w) }.to raise_error(ArgumentError)
|
43
|
+
end
|
44
|
+
|
45
|
+
it "should be symmetric" do
|
46
|
+
x = Measurable.jaccard(@u, @v)
|
47
|
+
y = Measurable.jaccard(@v, @u)
|
48
|
+
|
49
|
+
x.should be_within(TOLERANCE).of(y)
|
50
|
+
end
|
51
|
+
|
52
|
+
it "should return the correct value" do
|
53
|
+
x = Measurable.jaccard(@u, @v)
|
54
|
+
|
55
|
+
x.should be_within(TOLERANCE).of(1.0 / 3.0)
|
56
|
+
end
|
57
|
+
|
58
|
+
it "shouldn't work with vectors of different length" do
|
59
|
+
expect { Measurable.jaccard(@u, [1, 2, 3, 4]) }.to raise_error
|
60
|
+
end
|
61
|
+
end
|
62
|
+
end
|
data/spec/maxmin_spec.rb
ADDED
@@ -0,0 +1,30 @@
|
|
1
|
+
describe "Max-min distance" do
|
2
|
+
|
3
|
+
before :all do
|
4
|
+
@u = [1, 3, 16]
|
5
|
+
@v = [1, 4, 16]
|
6
|
+
@w = [4, 5, 6]
|
7
|
+
end
|
8
|
+
|
9
|
+
it "accepts two arguments" do
|
10
|
+
expect { Measurable.maxmin(@u, @v) }.to_not raise_error
|
11
|
+
expect { Measurable.maxmin(@u, @v, @w) }.to raise_error(ArgumentError)
|
12
|
+
end
|
13
|
+
|
14
|
+
it "should be symmetric" do
|
15
|
+
x = Measurable.maxmin(@u, @v)
|
16
|
+
y = Measurable.maxmin(@v, @u)
|
17
|
+
|
18
|
+
x.should be_within(TOLERANCE).of(y)
|
19
|
+
end
|
20
|
+
|
21
|
+
it "should return the correct value" do
|
22
|
+
x = Measurable.maxmin(@u, @v)
|
23
|
+
|
24
|
+
x.should be_within(TOLERANCE).of(0.9523809523)
|
25
|
+
end
|
26
|
+
|
27
|
+
it "shouldn't work with vectors of different length" do
|
28
|
+
expect { Measurable.maxmin(@u, [1, 3, 5, 7]) }.to raise_error
|
29
|
+
end
|
30
|
+
end
|
data/spec/spec_helper.rb
CHANGED
data/spec/tanimoto_spec.rb
ADDED
@@ -0,0 +1,30 @@
|
|
1
|
+
describe "Tanimoto distance" do
|
2
|
+
|
3
|
+
before :all do
|
4
|
+
@u = [1, 0, 1]
|
5
|
+
@v = [1, 1, 1]
|
6
|
+
@w = [0, 1, 0]
|
7
|
+
end
|
8
|
+
|
9
|
+
it "accepts two arguments" do
|
10
|
+
expect { Measurable.tanimoto(@u, @v) }.to_not raise_error
|
11
|
+
expect { Measurable.tanimoto(@u, @v, @w) }.to raise_error(ArgumentError)
|
12
|
+
end
|
13
|
+
|
14
|
+
it "should be symmetric" do
|
15
|
+
x = Measurable.tanimoto(@u, @v)
|
16
|
+
y = Measurable.tanimoto(@v, @u)
|
17
|
+
|
18
|
+
x.should be_within(TOLERANCE).of(y)
|
19
|
+
end
|
20
|
+
|
21
|
+
it "should return the correct value" do
|
22
|
+
x = Measurable.tanimoto(@u, @v)
|
23
|
+
|
24
|
+
x.should be_within(TOLERANCE).of(-Math.log2(2.0 / 3.0))
|
25
|
+
end
|
26
|
+
|
27
|
+
it "shouldn't work with vectors of different length" do
|
28
|
+
expect { Measurable.tanimoto(@u, [1, 3, 5, 7]) }.to raise_error
|
29
|
+
end
|
30
|
+
end
|
metadata
CHANGED
@@ -1,14 +1,14 @@
|
|
1
1
|
--- !ruby/object:Gem::Specification
|
2
2
|
name: measurable
|
3
3
|
version: !ruby/object:Gem::Version
|
4
|
-
version: 0.0.
|
4
|
+
version: 0.0.5
|
5
5
|
platform: ruby
|
6
6
|
authors:
|
7
7
|
- Carlos Agarie
|
8
8
|
autorequire:
|
9
9
|
bindir: bin
|
10
10
|
cert_chain: []
|
11
|
-
date: 2013-
|
11
|
+
date: 2013-07-24 00:00:00.000000000 Z
|
12
12
|
dependencies:
|
13
13
|
- !ruby/object:Gem::Dependency
|
14
14
|
name: bundler
|
@@ -74,8 +74,13 @@ files:
|
|
74
74
|
- lib/measurable/tanimoto.rb
|
75
75
|
- lib/measurable/version.rb
|
76
76
|
- measurable.gemspec
|
77
|
-
- spec/
|
77
|
+
- spec/cosine_spec.rb
|
78
|
+
- spec/euclidean_spec.rb
|
79
|
+
- spec/haversine_spec.rb
|
80
|
+
- spec/jaccard_spec.rb
|
81
|
+
- spec/maxmin_spec.rb
|
78
82
|
- spec/spec_helper.rb
|
83
|
+
- spec/tanimoto_spec.rb
|
79
84
|
homepage: http://github.com/agarie/measurable
|
80
85
|
licenses: []
|
81
86
|
metadata: {}
|
@@ -95,10 +100,15 @@ required_rubygems_version: !ruby/object:Gem::Requirement
|
|
95
100
|
version: '0'
|
96
101
|
requirements: []
|
97
102
|
rubyforge_project:
|
98
|
-
rubygems_version: 2.0.
|
103
|
+
rubygems_version: 2.0.3
|
99
104
|
signing_key:
|
100
105
|
specification_version: 4
|
101
106
|
summary: A Ruby gem with a lot of distance measures for your projects.
|
102
107
|
test_files:
|
103
|
-
- spec/
|
108
|
+
- spec/cosine_spec.rb
|
109
|
+
- spec/euclidean_spec.rb
|
110
|
+
- spec/haversine_spec.rb
|
111
|
+
- spec/jaccard_spec.rb
|
112
|
+
- spec/maxmin_spec.rb
|
104
113
|
- spec/spec_helper.rb
|
114
|
+
- spec/tanimoto_spec.rb
|
data/spec/measurable_spec.rb
DELETED
@@ -1,159 +0,0 @@
|
|
1
|
-
describe Measurable do
|
2
|
-
|
3
|
-
describe "Binary union" do
|
4
|
-
|
5
|
-
end
|
6
|
-
|
7
|
-
describe "Binary intersection" do
|
8
|
-
|
9
|
-
end
|
10
|
-
|
11
|
-
describe "Euclidean" do
|
12
|
-
|
13
|
-
before :all do
|
14
|
-
@u = [1, 3, 16]
|
15
|
-
@v = [1, 4, 16]
|
16
|
-
@w = [4, 5, 6]
|
17
|
-
end
|
18
|
-
|
19
|
-
context "Distance" do
|
20
|
-
it "accepts two arguments" do
|
21
|
-
expect { Measurable.euclidean(@u, @v) }.to_not raise_error
|
22
|
-
expect { Measurable.euclidean(@u, @v, @w) }.to raise_error(ArgumentError)
|
23
|
-
end
|
24
|
-
|
25
|
-
it "accepts one argument and returns the vector's norm" do
|
26
|
-
# Remember that 3^2 + 4^2 = 5^2.
|
27
|
-
Measurable.euclidean([3, 4]).should == 5
|
28
|
-
end
|
29
|
-
|
30
|
-
it "should be symmetric" do
|
31
|
-
Measurable.euclidean(@u, @v).should == Measurable.euclidean(@v, @u)
|
32
|
-
end
|
33
|
-
|
34
|
-
it "should return the correct value" do
|
35
|
-
Measurable.euclidean(@u, @u).should == 0
|
36
|
-
Measurable.euclidean(@u, @v).should == 1
|
37
|
-
end
|
38
|
-
|
39
|
-
it "shouldn't work with vectors of different length" do
|
40
|
-
expect { Measurable.euclidean(@u, [2, 2, 2, 2]) }.to raise_error
|
41
|
-
end
|
42
|
-
end
|
43
|
-
|
44
|
-
context "Squared Distance" do
|
45
|
-
it "accepts two arguments" do
|
46
|
-
expect { Measurable.euclidean_squared(@u, @v) }.to_not raise_error
|
47
|
-
expect { Measurable.euclidean_squared(@u, @v, @w) }.to raise_error(ArgumentError)
|
48
|
-
end
|
49
|
-
|
50
|
-
it "accepts one argument and returns the vector's norm" do
|
51
|
-
# Remember that 3^2 + 4^2 = 5^2.
|
52
|
-
Measurable.euclidean_squared([3, 4]).should == 25
|
53
|
-
end
|
54
|
-
|
55
|
-
it "should be symmetric" do
|
56
|
-
x = Measurable.euclidean_squared(@u, @v)
|
57
|
-
y = Measurable.euclidean_squared(@v, @u)
|
58
|
-
|
59
|
-
x.should == y
|
60
|
-
end
|
61
|
-
|
62
|
-
it "should return the correct value" do
|
63
|
-
Measurable.euclidean_squared(@u, @u).should == 0
|
64
|
-
Measurable.euclidean_squared(@u, @v).should == 1
|
65
|
-
end
|
66
|
-
|
67
|
-
it "shouldn't work with vectors of different length" do
|
68
|
-
expect { Measurable.euclidean_squared(@u, [2, 2, 2, 2]) }.to raise_error
|
69
|
-
end
|
70
|
-
end
|
71
|
-
|
72
|
-
end
|
73
|
-
|
74
|
-
describe "Cosine distance" do
|
75
|
-
it "accepts two arguments"
|
76
|
-
|
77
|
-
it "accepts one argument and returns the vector's norm"
|
78
|
-
|
79
|
-
it "should handle NaN's"
|
80
|
-
|
81
|
-
it "should be symmetric"
|
82
|
-
|
83
|
-
it "should return the correct value"
|
84
|
-
|
85
|
-
it "shouldn't work with vectors of different length"
|
86
|
-
end
|
87
|
-
|
88
|
-
describe "Chebyshev distance" do
|
89
|
-
it "accepts two arguments"
|
90
|
-
|
91
|
-
it "accepts one argument and returns the vector's norm"
|
92
|
-
|
93
|
-
it "should be symmetric"
|
94
|
-
|
95
|
-
it "should return the correct value"
|
96
|
-
|
97
|
-
it "shouldn't work with vectors of different length"
|
98
|
-
end
|
99
|
-
|
100
|
-
describe "Tanimoto distance" do
|
101
|
-
it "accepts two arguments"
|
102
|
-
|
103
|
-
it "accepts one argument and returns the vector's norm"
|
104
|
-
|
105
|
-
it "should be symmetric"
|
106
|
-
|
107
|
-
it "should return the correct value"
|
108
|
-
|
109
|
-
it "shouldn't work with vectors of different length"
|
110
|
-
end
|
111
|
-
|
112
|
-
describe "Haversine distance" do
|
113
|
-
it "accepts two arguments"
|
114
|
-
|
115
|
-
it "accepts one argument and returns the vector's norm"
|
116
|
-
|
117
|
-
it "should be symmetric"
|
118
|
-
|
119
|
-
it "should return the correct value"
|
120
|
-
|
121
|
-
it "shouldn't work with vectors of different length"
|
122
|
-
end
|
123
|
-
|
124
|
-
describe "Jaccard distance" do
|
125
|
-
it "accepts two arguments"
|
126
|
-
|
127
|
-
it "accepts one argument and returns the vector's norm"
|
128
|
-
|
129
|
-
it "should be symmetric"
|
130
|
-
|
131
|
-
it "should return the correct value"
|
132
|
-
|
133
|
-
it "shouldn't work with vectors of different length"
|
134
|
-
end
|
135
|
-
|
136
|
-
describe "Binary Jaccard distance" do
|
137
|
-
it "accepts two arguments"
|
138
|
-
|
139
|
-
it "accepts one argument and returns the vector's norm"
|
140
|
-
|
141
|
-
it "should be symmetric"
|
142
|
-
|
143
|
-
it "should return the correct value"
|
144
|
-
|
145
|
-
it "shouldn't work with vectors of different length"
|
146
|
-
end
|
147
|
-
|
148
|
-
describe "Max-min distance" do
|
149
|
-
it "accepts two arguments"
|
150
|
-
|
151
|
-
it "accepts one argument and returns the vector's norm"
|
152
|
-
|
153
|
-
it "should be symmetric"
|
154
|
-
|
155
|
-
it "should return the correct value"
|
156
|
-
|
157
|
-
it "shouldn't work with vectors of different length"
|
158
|
-
end
|
159
|
-
end
|
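Note on the removed Euclidean examples: the deleted spec pins down a small contract for `Measurable.euclidean` and `Measurable.euclidean_squared`. Two vectors give their distance, a single vector gives its norm, three arguments raise `ArgumentError`, and vectors of different length raise an error. The sketch below is one way that contract could be satisfied. It is an illustration only, not the gem's implementation: the module name `EuclideanSketch` is hypothetical, and raising `ArgumentError` on a length mismatch is an assumption, since the spec only requires that some error is raised.

```ruby
# Hypothetical stand-in for illustration; NOT the gem's actual code.
module EuclideanSketch
  module_function

  # Squared Euclidean distance between u and v.
  # With a single argument, v defaults to the zero vector, so the result
  # is the squared norm of u.
  def euclidean_squared(u, v = nil)
    v ||= Array.new(u.size, 0)
    # Assumption: the spec only says "raise_error", the class is our choice.
    raise ArgumentError, "vectors must have the same length" if u.size != v.size

    u.zip(v).reduce(0) { |sum, (a, b)| sum + (a - b)**2 }
  end

  # Euclidean distance (or norm, with one argument).
  def euclidean(u, v = nil)
    Math.sqrt(euclidean_squared(u, v))
  end
end

EuclideanSketch.euclidean([3, 4])                  # => 5.0 (3^2 + 4^2 = 5^2)
EuclideanSketch.euclidean([1, 3, 16], [1, 4, 16])  # => 1.0
```

Because `Math.sqrt` returns a `Float`, comparisons such as `== 5` still hold in Ruby (`5.0 == 5` is true), which matches the equality checks in the removed spec.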