measurable 0.0.1 → 0.0.3

Sign up to get free protection for your applications and to get access to all the features.
checksums.yaml ADDED
@@ -0,0 +1,7 @@
1
+ ---
2
+ SHA1:
3
+ metadata.gz: c923bf9e9bd70c37d84330fcbb9d883f72344b04
4
+ data.tar.gz: bea042df8b59927f38b7ace662f7f8bb41f3f33a
5
+ SHA512:
6
+ metadata.gz: 5ce3eaec6a905c087b6538baf92172e0099b149eaf206b6af456645d6bc6e9b3b3975566de4c44a8dc84ef0c7dbc6cd11a53ce40d7f8de1b404540b37fa12c52
7
+ data.tar.gz: a9d1eb70f7c8b1e13878f23b3001a0eabfbbbe6fb59bf9b56704b6f3e8672ea6cf0954a92d8556219eb6a8ab6f79c7489dede6b82c33230bab821b56ec63ef71
data/.gitignore CHANGED
@@ -1,3 +1,4 @@
1
1
  pkg
2
2
  tmp/*
3
- benchmarks/*
3
+ benchmarks/*
4
+ lib/*.bundle
data/.rspec ADDED
@@ -0,0 +1 @@
1
+ --color
data/Gemfile.lock CHANGED
@@ -1,15 +1,13 @@
1
1
  PATH
2
2
  remote: .
3
3
  specs:
4
- measurables (0.0.1)
4
+ measurable (0.0.3)
5
5
 
6
6
  GEM
7
7
  remote: http://rubygems.org/
8
8
  specs:
9
9
  diff-lcs (1.1.3)
10
10
  rake (0.9.2.2)
11
- rake-compiler (0.8.1)
12
- rake
13
11
  rspec (2.9.0)
14
12
  rspec-core (~> 2.9.0)
15
13
  rspec-expectations (~> 2.9.0)
@@ -24,7 +22,6 @@ PLATFORMS
24
22
 
25
23
  DEPENDENCIES
26
24
  bundler
27
- measurables!
25
+ measurable!
28
26
  rake (~> 0.9)
29
- rake-compiler (~> 0.8.1)
30
27
  rspec (~> 2.9.0)
data/LICENSE CHANGED
@@ -1,4 +1,6 @@
1
- Copyright (c) 2012 Carlos Agarie
1
+ MIT License
2
+
3
+ Copyright (c) 2012-2013 Carlos Agarie <carlos@onox.com.br>
2
4
 
3
5
  Permission is hereby granted, free of charge, to any person obtaining
4
6
  a copy of this software and associated documentation files (the
data/README.md CHANGED
@@ -1,13 +1,39 @@
1
1
  # Measurable
2
2
 
3
- This (soon to be) gem encompasses various distance measures to be used in different projects. I want to support both the built-in `Array` class and [NMatrix](http://github.com/sciruby/nmatrix)'s `NVector`.
3
+ This gem encompasses various distance measures. Besides the `Array` class, I also want to support [NMatrix](http://github.com/sciruby/nmatrix)'s `NVector`.
4
+
5
+ My objective is to be able to compare different metrics just by changing which method is called. Also, to show how to use NMatrix's C API. I'll create most of the things in pure Ruby first, then the most used operations (or the slowest ones) will be rewritten in C.
4
6
 
5
7
  This is a fork of the gem [Distance Measure](https://github.com/reddavis/Distance-Measures), which has a similar objective, but isn't actively maintained and doesn't support NMatrix. Thank you, [reddavis](https://github.com/reddavis). :)
6
8
 
7
- # Install
9
+ ## Install
10
+
11
+ `gem install measurable`
12
+
13
+ It only works with Ruby MRI 1.9.3 or 2.0.0. I still want to test it on JRuby, but as its still pure Ruby, it should work correctly there.
14
+
15
+ ## Distance measures that I want to support for the moment
16
+
17
+ - Euclidean distance
18
+ - Squared euclidean distance
19
+ - Cosine distance
20
+ - Max-min distance (["K-Means clustering using max-min distance measure"][1])
21
+ - Jaccard distance
22
+ - Tanimoto distance
23
+
24
+ These still need to be implemented:
25
+
26
+ - Cityblock distance
27
+ - Chebyshev distance
28
+ - Minkowski distance
29
+ - Hamming distance
30
+ - Correlation distance
31
+ - Chi-square distance
32
+ - Kullback-Leibler divergence
33
+ - Jensen-Shannon divergence
34
+ - Mahalanobis distance
35
+ - Squared Mahalanobis distance
8
36
 
9
- I'll update this section when I publish the gem. For now... wait.
10
-
11
37
  ## How to use
12
38
 
13
39
  This list will be updated as I have time. I'll refactor the existing measures and add some that I'll need in a project.
@@ -20,54 +46,20 @@ require "measurable"
20
46
  u = NVector.ones(2)
21
47
  v = NVector.zeros(2)
22
48
  w = [1, 0]
49
+ x = [2, 2]
23
50
 
24
51
  Measurable::euclidean(u, v) # => 1.41421
25
52
  Measurable::euclidean(w, v) # => 1.00000
26
53
  Measurable::euclidean(w, w) # => 0.00000
54
+ Measurable::
27
55
  ```
28
56
 
29
- Maybe add some support for some of NMatrix's dtypes, like `:float32`, `:float64`, `:complex64`, `:complex128`, etc.
30
-
31
- ## How to use, the old way:
32
-
33
- a = [1,1]
34
- b = [2,2]
35
-
36
- a.euclidean_distance(b)
37
-
38
- a.cosine_similarity(b)
39
-
40
- a.jaccard_index(b)
41
-
42
- a.jaccard_distance(b)
43
-
44
- a.binary_jaccard_index(b)
45
-
46
- a.binary_jaccard_distance(b)
47
-
48
- a.tanimoto_coefficient(b)
49
-
50
- a.haversine_distance(b)
51
-
52
- This may or may not be the complete list, best thing is to check the source code.
53
-
54
- There are also a couple bonus methods:
55
-
56
- a.dot_product(b)
57
-
58
- a.sum_of_squares
59
-
60
- a.intersection_with(b)
61
-
62
- a.union_with(b)
63
-
64
- # When your dealing with 1's and 0's
65
- a.binary_intersection_with(b)
66
-
67
- a.binary_union_with(b)
57
+ Maybe add support for (some of) NMatrix's dtypes, like `:float32`, `:float64`, `:complex64`, `:complex128`, etc. This will have to way until Measurable supports NMatrix C API.
68
58
 
69
59
  ## License
70
60
 
71
- Copyright (c) 2012 Carlos Agarie. See LICENSE for details.
61
+ See LICENSE for details.
62
+
63
+ The original `Distance Measure` gem is copyrighted by @reddavis.
72
64
 
73
- The original `Distance Measure` gem is copyrighted by reddavis 2010.
65
+ [1]: http://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=05156398
data/Rakefile CHANGED
@@ -1,5 +1,5 @@
1
1
  require 'rake'
2
- require "rake/extensiontask"
2
+ require 'bundler/gem_tasks'
3
3
 
4
4
  # Setup the necessary gems, specified in the gemspec.
5
5
  require 'bundler'
@@ -12,9 +12,9 @@ rescue Bundler::BundlerError => e
12
12
  end
13
13
 
14
14
  # Compile task.
15
- Rake::ExtensionTask.new do |ext|
16
- ext.name = 'measurable'
17
- ext.ext_dir = 'ext/measurable'
18
- ext.lib_dir = 'lib/'
19
- ext.source_pattern = "**/*.{c, cpp, h}"
20
- end
15
+ # Rake::ExtensionTask.new do |ext|
16
+ # ext.name = 'measurable'
17
+ # ext.ext_dir = 'ext/measurable'
18
+ # ext.lib_dir = 'lib/'
19
+ # ext.source_pattern = "**/*.{c, cpp, h}"
20
+ # end
data/lib/measurable.rb CHANGED
@@ -1,32 +1,55 @@
1
- $:.unshift(File.dirname(__FILE__) + "/../lib")
1
+ require 'measurable/version.rb'
2
2
 
3
- require "measurable/version.rb"
3
+ # Distance measures.
4
+ reqiore 'measurable/euclidean'
5
+ require 'measurable/cosine'
6
+ require 'measurable/tanimoto_coefficient'
7
+ require 'measurable/jaccard'
8
+ require 'measurable/haversine'
4
9
 
5
- require "measurable/cosine_similarity"
6
- require "measurable/tanimoto_coefficient"
7
- require "measurable/jaccard"
8
- require "measurable/haversine"
9
-
10
- require "measurable.so"
11
-
12
- class Array
13
- include Measurable
10
+ module Measurable
11
+ # PI = 3.1415926535
12
+ RAD_PER_DEG = 0.017453293 # PI/180
14
13
 
15
14
  # http://en.wikipedia.org/wiki/Intersection_(set_theory)
16
- def intersection_with(other)
17
- (self & other)
15
+ def intersection(u, v)
16
+ (u & v)
18
17
  end
19
18
 
20
19
  # http://en.wikipedia.org/wiki/Union_(set_theory)
21
- def union_with(other)
22
- (self + other).uniq
20
+ def union(u, v)
21
+ (u + v).uniq
23
22
  end
24
23
 
25
- private
24
+ def binary_union(u, v)
25
+ unions = []
26
+ u.each_with_index do |n, index|
27
+ if n == 1 || v[index] == 1
28
+ unions << 1
29
+ else
30
+ unions << 0
31
+ end
32
+ end
33
+
34
+ unions
35
+ end
36
+
37
+ def binary_intersection(u, v)
38
+ intersects = []
39
+ u.each_with_index do |n, index|
40
+ if n == 1 && v[index] == 1
41
+ intersects << 1
42
+ else
43
+ intersects << 0
44
+ end
45
+ end
46
+
47
+ intersects
48
+ end
26
49
 
27
50
  # Checks if we"re dealing with NaN"s and will return 0.0 unless
28
51
  # handle NaN"s is set to false
29
52
  def handle_nan(result)
30
53
  result.nan? ? 0.0 : result
31
54
  end
32
- end
55
+ end
@@ -1,6 +1,6 @@
1
1
  module Measurable
2
- def self.cosine_similarity(other)
3
- dot_product = self.dot_product(other)
2
+ def cosine(u, v)
3
+ dot_product = dot(u, v)
4
4
  normalization = self.euclidean_normalize * other.euclidean_normalize
5
5
 
6
6
  handle_nan(dot_product / normalization)
@@ -0,0 +1,17 @@
1
+ module Measurable
2
+ def euclidean(u, v)
3
+ sum = 0.0
4
+
5
+ u.zip(v).each do |ary|
6
+ sum += (ary.first - ary.last)**2
7
+ end
8
+
9
+ Math.sqrt(sum)
10
+ end
11
+
12
+ def euclidean_squared(u, v)
13
+ u.zip(v).reduce(0.0) do |acc, ary|
14
+ acc += (ary.first - ary.last)**2
15
+ end
16
+ end
17
+ end
@@ -1,19 +1,17 @@
1
- #
2
1
  # Notes:
3
2
  #
4
3
  # translated into Ruby based on information contained in:
5
- # http://mathforum.org/library/drmath/view/51879.html Doctors Rick and Peterson - 4/20/99
6
- # http://www.movable-type.co.uk/scripts/latlong.html
7
- # http://en.wikipedia.org/wiki/Haversine_formula
4
+ # http://mathforum.org/library/drmath/view/51879.html
5
+ # Dr. Rick and Dr. Peterson - 4/20/99
8
6
  #
9
- # This formula can compute accurate distances between two points given latitude and longitude, even for
10
- # short distances.
7
+ # http://www.movable-type.co.uk/scripts/latlong.html
8
+ # http://en.wikipedia.org/wiki/Haversine_formula
9
+ #
10
+ # This formula can compute accurate distances between two points given latitude
11
+ # and longitude, even for short distances.
11
12
 
12
13
  module Measurable
13
14
 
14
- # PI = 3.1415926535
15
- RAD_PER_DEG = 0.017453293 # PI/180
16
-
17
15
  R_MILES = 3956 # radius of the great circle in miles
18
16
  R_KM = 6371 # radius in kilometers...some algorithms use 6367
19
17
 
@@ -25,18 +23,18 @@ module Measurable
25
23
  :meters => R_KM * 1000
26
24
  }
27
25
 
28
- def haversine_distance(other, um = :meters)
29
- dlon = other[1] - self[1]
30
- dlat = other[0] - self[0]
26
+ def haversine(u, v, um = :meters)
27
+ dlon = u[1] - v[1]
28
+ dlat = u[0] - v[0]
31
29
 
32
30
  dlon_rad = dlon * RAD_PER_DEG
33
31
  dlat_rad = dlat * RAD_PER_DEG
34
32
 
35
- lat1_rad = self[0] * RAD_PER_DEG
36
- lon1_rad = self[1] * RAD_PER_DEG
33
+ lat1_rad = v[0] * RAD_PER_DEG
34
+ lon1_rad = v[1] * RAD_PER_DEG
37
35
 
38
- lat2_rad = other[0] * RAD_PER_DEG
39
- lon2_rad = other[1] * RAD_PER_DEG
36
+ lat2_rad = u[0] * RAD_PER_DEG
37
+ lon2_rad = u[1] * RAD_PER_DEG
40
38
 
41
39
  a = (Math.sin(dlat_rad/2))**2 + Math.cos(lat1_rad) * Math.cos(lat2_rad) * (Math.sin(dlon_rad/2))**2
42
40
  c = 2 * Math.atan2( Math.sqrt(a), Math.sqrt(1-a))
@@ -1,26 +1,26 @@
1
1
  # http://en.wikipedia.org/wiki/Jaccard_coefficient
2
2
  module Measurable
3
3
 
4
- def jaccard_distance(other)
5
- 1 - self.jaccard_index(other)
4
+ def jaccard(u, v)
5
+ 1 - jaccard_index(u, v)
6
6
  end
7
7
 
8
- def jaccard_index(other)
9
- union = (self + other).uniq.size.to_f
10
- intersection = self.intersection_with(other).size.to_f
8
+ def jaccard_index(u, v)
9
+ union = (u + v).uniq.size.to_f
10
+ i = intersection(u, v).size.to_f
11
11
 
12
- intersection / union
12
+ i / union
13
13
  end
14
14
 
15
- def binary_jaccard_distance(other)
16
- 1 - self.binary_jaccard_index(other)
15
+ def binary_jaccard(u, v)
16
+ 1 - binary_jaccard_index(u, v)
17
17
  end
18
18
 
19
- def binary_jaccard_index(other)
20
- intersection = self.binary_intersection_with(other).delete_if {|x| x == 0}.size.to_f
21
- union = self.binary_union_with(other).delete_if {|x| x == 0}.size.to_f
19
+ def binary_jaccard_index(u, v)
20
+ i = binary_intersection(u, v).delete_if {|x| x == 0}.size.to_f
21
+ union = binary_union(u, v).delete_if {|x| x == 0}.size.to_f
22
22
 
23
- intersection / union
23
+ i / union
24
24
  end
25
25
 
26
26
  end
@@ -1,8 +1,8 @@
1
1
  # http://en.wikipedia.org/wiki/Jaccard_index#Tanimoto_coefficient_.28extended_Jaccard_coefficient.29
2
2
  module Measurable
3
- def tanimoto_coefficient(other)
4
- dot = self.dot_product(other).to_f
5
- result = dot / (self.sum_of_squares + other.sum_of_squares - dot).to_f
3
+ def tanimoto(u, v)
4
+ dot = dot(u, v).to_f
5
+ result = dot / (u.sum_of_squares + v.sum_of_squares - dot).to_f
6
6
 
7
7
  handle_nan(result)
8
8
  end
@@ -1,3 +1,3 @@
1
1
  module Measurable
2
- VERSION = "0.0.1"
2
+ VERSION = "0.0.3"
3
3
  end
data/measurable.gemspec CHANGED
@@ -1,30 +1,28 @@
1
- lib = File.expand_path('../lib/', __FILE__)
2
- $:.unshift lib unless $:.include?(lib)
1
+ $:.unshift File.expand_path('../lib/', __FILE__)
3
2
 
4
3
  require 'measurable/version'
4
+ require 'date'
5
5
 
6
6
  Gem::Specification.new do |gem|
7
7
  gem.name = "measurable"
8
8
  gem.version = Measurable::VERSION
9
9
  gem.date = Date.today.to_s
10
- gem.summary = %Q{A Ruby module with a lot of distance measures for your projects.}
11
- gem.description = %Q{A Ruby module with a lot of distance measures for your projects.}
10
+ gem.summary = %Q{A Ruby gem with a lot of distance measures for your projects.}
11
+ gem.description = %Q{A Ruby gem with a lot of distance measures for your projects.}
12
12
 
13
13
  gem.authors = ["Carlos Agarie"]
14
- gem.email = "carlos@onox.com.br"
14
+ gem.email = "carlos.agarie@gmail.com"
15
15
  gem.homepage = "http://github.com/agarie/measurable"
16
16
 
17
17
  gem.files = `git ls-files`.split("\n")
18
18
  gem.test_files = `git ls-files -- {test,spec,features}/*`.split("\n")
19
19
  gem.executables = `git ls-files -- bin/*`.split("\n").map{ |f| File.basename(f) }
20
20
 
21
- gem.require_paths = ["lib"]
22
- gem.extensions = ['ext/measurable/extconf.rb']
21
+ gem.require_paths = ["lib"]
23
22
 
24
- gem.required_ruby_version = '>= 1.9.2'
23
+ gem.required_ruby_version = '>= 1.9.3'
25
24
 
26
25
  gem.add_development_dependency 'bundler'
27
26
  gem.add_development_dependency 'rake', '~> 0.9'
28
- gem.add_development_dependency 'rake-compiler', '~> 0.8.1'
29
27
  gem.add_development_dependency 'rspec', '~> 2.9.0'
30
28
  end
@@ -0,0 +1,69 @@
1
+ describe Measurable do
2
+
3
+ let(:u) { [1, 3, 16] }
4
+ let(:v) { [1, 4, 16] }
5
+ let(:w) { [4, 5, 6] }
6
+
7
+ describe "Euclidean distance" do
8
+ it "accepts two arguments" do
9
+ expect { Measurable::euclidean(:u) }.to raise_error(ArgumentError)
10
+ expect { Measurable::euclidean(:u, :v) }.to_not raise_error(ArgumentError)
11
+ expect { Measurable::euclidean(:u, :v, :w) }.to raise_error(ArgumentError)
12
+ end
13
+
14
+ it "accepts one argument and returns the vector's norm"
15
+
16
+ it "should be symmetric"
17
+
18
+ it "should return the correct value" do
19
+ Measurable::euclidean(:u, :u).should == 0
20
+ euclidean(:u, :v).should == 1
21
+ end
22
+
23
+ it "shouldn't work with vectors of different length" do
24
+ expect { Measurable::euclidean(:u, [2, 2, 2, 2]) }.to raise_error(DiffLengthError)
25
+ end
26
+ end
27
+
28
+ describe "Binary union" do
29
+
30
+ describe "Binary intersection" do
31
+
32
+ describe "Cosine similarity measure" do
33
+ it "accepts two arguments"
34
+
35
+ it "accepts one argument and returns the vector's norm"
36
+
37
+ it "should handle NaN's"
38
+
39
+ it "should be symmetric"
40
+
41
+ it "should return the correct value"
42
+
43
+ it "shouldn't work with vectors of different length"
44
+ end
45
+
46
+ describe "Chebyshev distance" do
47
+ it "accepts two arguments"
48
+
49
+ it "accepts one argument and returns the vector's norm"
50
+
51
+ it "should be symmetric"
52
+
53
+ it "should return the correct value"
54
+
55
+ it "shouldn't work with vectors of different length"
56
+ end
57
+
58
+ describe "Max-min similarity measure" do
59
+ it "accepts two arguments"
60
+
61
+ it "accepts one argument and returns the vector's norm"
62
+
63
+ it "should be symmetric"
64
+
65
+ it "should return the correct value"
66
+
67
+ it "shouldn't work with vectors of different length"
68
+ end
69
+ end
data/spec/spec_helper.rb CHANGED
@@ -2,8 +2,3 @@ $LOAD_PATH.unshift(File.dirname(__FILE__))
2
2
  $LOAD_PATH.unshift(File.join(File.dirname(__FILE__), '..', 'lib'))
3
3
 
4
4
  require 'measurable'
5
- require 'spec'
6
- require 'spec/autorun'
7
-
8
- Spec::Runner.configure do |config|
9
- end
metadata CHANGED
@@ -1,36 +1,32 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: measurable
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.0.1
5
- prerelease:
4
+ version: 0.0.3
6
5
  platform: ruby
7
6
  authors:
8
7
  - Carlos Agarie
9
8
  autorequire:
10
9
  bindir: bin
11
10
  cert_chain: []
12
- date: 2012-10-09 00:00:00.000000000 Z
11
+ date: 2013-03-24 00:00:00.000000000 Z
13
12
  dependencies:
14
13
  - !ruby/object:Gem::Dependency
15
14
  name: bundler
16
15
  requirement: !ruby/object:Gem::Requirement
17
- none: false
18
16
  requirements:
19
- - - ! '>='
17
+ - - '>='
20
18
  - !ruby/object:Gem::Version
21
19
  version: '0'
22
20
  type: :development
23
21
  prerelease: false
24
22
  version_requirements: !ruby/object:Gem::Requirement
25
- none: false
26
23
  requirements:
27
- - - ! '>='
24
+ - - '>='
28
25
  - !ruby/object:Gem::Version
29
26
  version: '0'
30
27
  - !ruby/object:Gem::Dependency
31
28
  name: rake
32
29
  requirement: !ruby/object:Gem::Requirement
33
- none: false
34
30
  requirements:
35
31
  - - ~>
36
32
  - !ruby/object:Gem::Version
@@ -38,31 +34,13 @@ dependencies:
38
34
  type: :development
39
35
  prerelease: false
40
36
  version_requirements: !ruby/object:Gem::Requirement
41
- none: false
42
37
  requirements:
43
38
  - - ~>
44
39
  - !ruby/object:Gem::Version
45
40
  version: '0.9'
46
- - !ruby/object:Gem::Dependency
47
- name: rake-compiler
48
- requirement: !ruby/object:Gem::Requirement
49
- none: false
50
- requirements:
51
- - - ~>
52
- - !ruby/object:Gem::Version
53
- version: 0.8.1
54
- type: :development
55
- prerelease: false
56
- version_requirements: !ruby/object:Gem::Requirement
57
- none: false
58
- requirements:
59
- - - ~>
60
- - !ruby/object:Gem::Version
61
- version: 0.8.1
62
41
  - !ruby/object:Gem::Dependency
63
42
  name: rspec
64
43
  requirement: !ruby/object:Gem::Requirement
65
- none: false
66
44
  requirements:
67
45
  - - ~>
68
46
  - !ruby/object:Gem::Version
@@ -70,59 +48,56 @@ dependencies:
70
48
  type: :development
71
49
  prerelease: false
72
50
  version_requirements: !ruby/object:Gem::Requirement
73
- none: false
74
51
  requirements:
75
52
  - - ~>
76
53
  - !ruby/object:Gem::Version
77
54
  version: 2.9.0
78
- description: A Ruby module with a lot of distance measures for your projects.
79
- email: carlos@onox.com.br
55
+ description: A Ruby gem with a lot of distance measures for your projects.
56
+ email: carlos.agarie@gmail.com
80
57
  executables: []
81
- extensions:
82
- - ext/measurable/extconf.rb
58
+ extensions: []
83
59
  extra_rdoc_files: []
84
60
  files:
85
61
  - .gitignore
62
+ - .rspec
86
63
  - Gemfile
87
64
  - Gemfile.lock
88
65
  - LICENSE
89
66
  - README.md
90
67
  - Rakefile
91
- - ext/measurable/extconf.rb
92
- - ext/measurable/measurable.c
93
68
  - lib/measurable.rb
94
- - lib/measurable/cosine_similarity.rb
69
+ - lib/measurable/cosine.rb
70
+ - lib/measurable/euclidean.rb
95
71
  - lib/measurable/haversine.rb
96
72
  - lib/measurable/jaccard.rb
97
- - lib/measurable/tanimoto_coefficient.rb
73
+ - lib/measurable/tanimoto.rb
98
74
  - lib/measurable/version.rb
99
75
  - measurable.gemspec
100
- - spec/measurable.rb
76
+ - spec/measurable_spec.rb
101
77
  - spec/spec_helper.rb
102
78
  homepage: http://github.com/agarie/measurable
103
79
  licenses: []
80
+ metadata: {}
104
81
  post_install_message:
105
82
  rdoc_options: []
106
83
  require_paths:
107
84
  - lib
108
85
  required_ruby_version: !ruby/object:Gem::Requirement
109
- none: false
110
86
  requirements:
111
- - - ! '>='
87
+ - - '>='
112
88
  - !ruby/object:Gem::Version
113
- version: 1.9.2
89
+ version: 1.9.3
114
90
  required_rubygems_version: !ruby/object:Gem::Requirement
115
- none: false
116
91
  requirements:
117
- - - ! '>='
92
+ - - '>='
118
93
  - !ruby/object:Gem::Version
119
94
  version: '0'
120
95
  requirements: []
121
96
  rubyforge_project:
122
- rubygems_version: 1.8.24
97
+ rubygems_version: 2.0.0
123
98
  signing_key:
124
- specification_version: 3
125
- summary: A Ruby module with a lot of distance measures for your projects.
99
+ specification_version: 4
100
+ summary: A Ruby gem with a lot of distance measures for your projects.
126
101
  test_files:
127
- - spec/measurable.rb
102
+ - spec/measurable_spec.rb
128
103
  - spec/spec_helper.rb
@@ -1,5 +0,0 @@
1
- require "mkmf"
2
-
3
- dir_config("measurable")
4
-
5
- create_makefile("measurable")
@@ -1,209 +0,0 @@
1
- #include <ruby.h>
2
- #include <math.h>
3
-
4
- #ifndef RUBY_19
5
- #ifndef RARRAY_LEN
6
- #define RARRAY_LEN(v) (RARRAY(v)->len)
7
- #endif
8
- #ifndef RARRAY_PTR
9
- #define RARRAY_PTR(v) (RARRAY(v)->ptr)
10
- #endif
11
- #endif
12
-
13
- /*
14
- ** def euclidean_distance(other)
15
- ** sum = 0.0
16
- ** self.each_index do |i|
17
- ** sum += (self[i] - other[i])**2
18
- ** end
19
- ** Math.sqrt(sum)
20
- ** end
21
- */
22
-
23
- static VALUE rb_euclidean(VALUE self, VALUE other_array) {
24
- double value = 0.0;
25
-
26
- /* TODO: check they're the same size. */
27
- long vector_length = (RARRAY_LEN(self) - 1);
28
- int index;
29
-
30
- for (index = 0; index <= vector_length; index++) {
31
- double x, y;
32
-
33
- x = NUM2DBL(RARRAY_PTR(self)[index]);
34
- y = NUM2DBL(RARRAY_PTR(other_array)[index]);
35
-
36
- value += pow(x - y, 2);
37
- }
38
-
39
- return rb_float_new(sqrt(value));
40
- }
41
-
42
- /* Prototypes */
43
- long c_array_size(VALUE array);
44
-
45
- /*
46
- ** def dot_product(other)
47
- ** sum = 0.0
48
- ** self.each_with_index do |n, index|
49
- ** sum += n * other[index]
50
- ** end
51
- **
52
- ** sum
53
- ** end
54
- */
55
-
56
- static VALUE rb_dot_product(VALUE self, VALUE other_array) {
57
- double sum = 0;
58
-
59
- /* TODO check they're the same size. */
60
- long array_size = c_array_size(self);
61
- int index;
62
-
63
- for(index = 0; index <= array_size; index++) {
64
- double x, y;
65
-
66
- x = NUM2DBL(RARRAY_PTR(self)[index]);
67
- y = NUM2DBL(RARRAY_PTR(other_array)[index]);
68
-
69
- sum += x * y;
70
- }
71
-
72
- return rb_float_new(sum);
73
- }
74
-
75
- /*
76
- ** def sum_of_squares
77
- ** inject(0) {|sum, n| sum + n ** 2}
78
- ** end
79
- */
80
-
81
- static VALUE rb_sum_of_squares(VALUE self) {
82
- double sum = 0;
83
- long array_size = c_array_size(self);
84
- int index;
85
-
86
- for(index = 0; index <= array_size; index++) {
87
- double x;
88
-
89
- x = NUM2DBL(RARRAY_PTR(self)[index]);
90
-
91
- sum += pow(x, 2);
92
- }
93
-
94
- return rb_float_new(sum);
95
- }
96
-
97
- /*
98
- ** def euclidean_normalize
99
- ** sum = 0.0
100
- ** self.each do |n|
101
- ** sum += n ** 2
102
- ** end
103
- **
104
- ** Math.sqrt(sum)
105
- ** end
106
- */
107
-
108
- static VALUE rb_euclidean_normalize(VALUE self) {
109
- double sum = 0;
110
- long array_size = c_array_size(self);
111
- int index;
112
-
113
- for(index = 0; index <= array_size; index++) {
114
- double x;
115
-
116
- x = NUM2DBL(RARRAY_PTR(self)[index]);
117
-
118
- sum += pow(x, 2);
119
- }
120
-
121
- return rb_float_new(sqrt(sum));
122
- }
123
-
124
- /*
125
- ** def binary_union_with(other)
126
- ** unions = []
127
- ** self.each_with_index do |n, index|
128
- ** if n == 1 || other[index] == 1
129
- ** unions << 1
130
- ** else
131
- ** unions << 0
132
- ** end
133
- ** end
134
- **
135
- ** unions
136
- ** end
137
- */
138
-
139
- static VALUE rb_binary_union_with(VALUE self, VALUE other_array) {
140
- //TODO: check arrays are same size
141
- long array_size = c_array_size(self);
142
- int index;
143
- VALUE results = rb_ary_new();
144
-
145
- for(index = 0; index <= array_size; index++) {
146
- int self_attribute = NUM2INT(RARRAY_PTR(self)[index]);
147
- int other_array_attribute = NUM2INT(RARRAY_PTR(other_array)[index]);
148
-
149
- if(self_attribute == 1 || other_array_attribute == 1) {
150
- rb_ary_push(results, rb_int_new(1));
151
- } else {
152
- rb_ary_push(results, rb_int_new(0));
153
- }
154
- }
155
-
156
- return results;
157
- }
158
-
159
- /*
160
- ** def binary_intersection_with(other)
161
- ** intersects = []
162
- ** self.each_with_index do |n, index|
163
- ** if n == 1 && other[index] == 1
164
- ** intersects << 1
165
- ** else
166
- ** intersects << 0
167
- ** end
168
- ** end
169
- **
170
- ** intersects
171
- ** end
172
- */
173
-
174
- static VALUE rb_binary_intersection_with(VALUE self, VALUE other_array) {
175
- /* TODO check arrays are same size */
176
- long array_size = c_array_size(self);
177
- int index;
178
- VALUE results = rb_ary_new();
179
-
180
- for(index = 0; index <= array_size; index++) {
181
- int self_attribute = NUM2INT(RARRAY_PTR(self)[index]);
182
- int other_array_attribute = NUM2INT(RARRAY_PTR(other_array)[index]);
183
-
184
- if(self_attribute == 1 && other_array_attribute == 1) {
185
- rb_ary_push(results, rb_int_new(1));
186
- } else {
187
- rb_ary_push(results, rb_int_new(0));
188
- }
189
- }
190
-
191
- return results;
192
- }
193
-
194
- /* return the size of a Ruby array - 1 */
195
- long c_array_size(VALUE array) {
196
- return (RARRAY_LEN(array) - 1);
197
- }
198
-
199
- void
200
- Init_measurable()
201
- {
202
- VALUE rb_measurable = rb_define_module("Measurable");
203
- rb_define_method(rb_measurable, "euclidean", rb_euclidean, 1);
204
- rb_define_method(rb_measurable, "dot_product", rb_dot_product, 1);
205
- rb_define_method(rb_measurable, "sum_of_squares", rb_sum_of_squares, 0);
206
- rb_define_method(rb_measurable, "euclidean_normalize", rb_euclidean_normalize, 0);
207
- rb_define_method(rb_measurable, "binary_union_with", rb_binary_union_with, 1);
208
- rb_define_method(rb_measurable, "binary_intersection_with", rb_binary_intersection_with, 1);
209
- }
data/spec/measurable.rb DELETED
@@ -1,106 +0,0 @@
1
- require File.expand_path(File.dirname(__FILE__) + '/spec_helper')
2
-
3
- describe Measurable do
4
-
5
- let(:array) { [5, 5] }
6
- let(:array_2) { [7, 3, 2, 4, 1] }
7
- let(:array_3) { [4, 1, 9, 7, 5] }
8
-
9
- describe "Euclidean Distance" do
10
- it "should return 0.0" do
11
- array.euclidean_distance(array).should == 0.0
12
- end
13
-
14
- it "should return 4.0" do
15
- [5].euclidean_distance([1]).should == 4.0
16
- end
17
- end
18
-
19
- describe "Cosine Similarity" do
20
- it "should return 1.0" do
21
- array.cosine_similarity(array).should.to_s == "1.0" # WTF
22
- end
23
-
24
- it "should handle NaN's" do
25
- [0.0, 0.0].cosine_similarity([0.0, 0.0]).nan?.should be_false
26
- end
27
- end
28
-
29
- describe "Tanimoto Coefficient" do
30
- it "should return 1.0" do
31
- array.tanimoto_coefficient(array).should == 1.0
32
- end
33
-
34
- it "should handle NaN's" do
35
- [0.0, 0.0].tanimoto_coefficient([0.0, 0.0]).nan?.should be_false
36
- end
37
- end
38
-
39
- describe "Sum of Squares" do
40
- it "should return 50" do
41
- array.sum_of_squares.should == 50
42
- end
43
- end
44
-
45
- describe "Jaccard" do
46
- describe "Jaccard Distance" do
47
- it "should return" do
48
- array_2.jaccard_distance(array_3).should == (1 - 3.0/7.0)
49
- end
50
- end
51
-
52
- describe "Jaccard Index" do
53
- it "should return" do
54
- array_2.jaccard_index(array_3).should == 3.0/7.0
55
- end
56
- end
57
-
58
- describe "Binary Jaccard Index" do
59
- it "should return 1/4" do
60
- [1,1,1,1].binary_jaccard_index([0,1,0,0]).should == 1/4.0
61
- end
62
- end
63
- end
64
-
65
- describe "Binary Jaccard Distance" do
66
- it "should return 0.75" do
67
- [1,1,1,1].binary_jaccard_distance([0,1,0,0]).should == 1 - (1/4.0)
68
- end
69
- end
70
-
71
- describe "Intersection" do
72
- it "should return [7,4,1]" do
73
- array_2.intersection_with(array_3).should == [7,4,1]
74
- end
75
- end
76
-
77
- describe "Union" do
78
- it "should return " do
79
- array_2.union_with(array_3).should == [7,3,2,4,1,9,5]
80
- end
81
- end
82
-
83
- describe "Binary Intersection" do
84
- it "should return [0,1,0,0]" do
85
- [1,1,1,1].binary_intersection_with([0,1,0,0]).should == [0,1,0,0]
86
- end
87
- end
88
-
89
- describe "Binary Union" do
90
- it "should return [1,1,1,0]" do
91
- [1,1,1,0].binary_union_with([0,0,0,0]).should == [1,1,1,0]
92
- end
93
- end
94
-
95
- describe "Dot Product" do
96
- it "should return 50" do
97
- [5, 5].dot_product([5, 5]).should == 50.0
98
- end
99
- end
100
-
101
- describe "Euclidean normalize" do
102
- it "should" do
103
- [10].euclidean_normalize.should == 10
104
- end
105
- end
106
- end