fast_statistics 0.2.0 → 0.2.1

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: f74e28e9d460ef4ebcab5b0efa21ef0bd7aa1bc17ec398292dbc90fb2f759c10
- data.tar.gz: 1ab07eeea3e7e139b0305b0d4109f780d2d7f372a0c3d8998ffc297a690f8eef
+ metadata.gz: 98882f948017b8d64fdd334bd0ea4e3475a9beb0e1817e766af4779a9f56aafe
+ data.tar.gz: e06600fef9d8026e559928bac3278b606a52a0c0f287320bfc9457d45a4ba7ff
  SHA512:
- metadata.gz: 7ce73ccecf92df45a42789a45dab9d00ea0642d5b08904e07ecb43eb75148c33f4d645737cf1999fd2b9b7931e80b5efc9f3882756b62820464481ae8a954bfe
- data.tar.gz: 8e9385b1776b36d9d45c402ccc3015ecc01c4c52a95fae4321f3f6409ebc050921192d2c049cb8b453c49a46defd899d25a84491dbb811f9e10ae3bc2caa2b60
+ metadata.gz: '0449c749047a282bc3aaebfeb1c84a5416cd98a727884f5153f25513b7750ad557b7c51cc51356245c7a28a2ed0d096638b3efd9a6a89801e4ad532386f7e7f7'
+ data.tar.gz: e3c2ceedb95e765e5e88dfb4ceddc36065f6040e2a066c03f480d75867f7c5763f559e73c26525ce493270cbdf3865cfe9f5e7a83076e85e752e21cb2fb12aad
data/README.md CHANGED
@@ -1,4 +1,4 @@
- # FastStatistics
+ # Fast Statistics :rocket:
  ![Build Status](https://travis-ci.com/Martin-Nyaga/fast_statistics.svg?branch=master)

  A high performance native ruby extension (written in C++) for computation of
@@ -9,6 +9,10 @@ This gem provides fast computation of descriptive statistics (min, max, mean,
  median, 1st and 3rd quartiles, population standard deviation) for a multivariate
  dataset (represented as a 2D array) in ruby.

+ It is **~11x** faster than an optimal algorithm in hand-written ruby, and
+ **~4.7x** faster than the next fastest available ruby gem or native extension
+ (see [benchmarks](#benchmarks) below).
+
  ## Installation

  Add this line to your application's Gemfile:
@@ -81,10 +85,6 @@ Some alternatives compared are:
  - [Numo::NArray](https://github.com/ruby-numo/numo-narray)
  - Hand-written ruby (using the same algorithm implemented in C++ in this gem)

- Benchmarked on my machine (8th gen i7, sse2), this gem is **~11x**
- faster than an optimal algorithm in hand-written ruby, and **~4.7x** faster than
- the next fastest available native ruby extension (that I tested).
-
  You can review the benchmark implementations at `benchmark/benchmark.rb` and run the
  benchmark with `rake benchmark`.

@@ -124,6 +124,15 @@ first explored performing the computations natively in [this
  repository](https://github.com/Martin-Nyaga/ruby-ffi-simd). The results were
  promising, so I decided to package it as a ruby gem.

+ **Note**: This is an early release and should be considered unstable, at least
+ until I'm confident in the stability & performance in a real world application
+ setting. Feel free to test it out in non-critical scenarios/environments (let
+ me know in [this discussion
+ thread](https://github.com/Martin-Nyaga/fast_statistics/discussions/1) or by
+ filing an issue if you use it!). I'm also not really an expert in C++, so
+ reviews & suggestions are welcome.
+
+ ### How is the performance achieved?
  The following factors combined help this gem achieve high performance compared
  to available native alternatives and hand-written computations in ruby:

@@ -139,20 +148,14 @@ to available native alternatives and hand-written computations in ruby:
  where possible, giving an additional speed advantage while still being single
  threaded.

- That said, there are some limitations in the current implementation:
+ ### Limitations of the current implementation
+ The speed gains notwithstanding, there are some limitations in the current implementation:
  - The variables in the 2D array must all have the same number of data points
  (inner arrays must have the same length) and contain only numbers (i.e. no
  `nil` awareness is present).
  - There is currently no API to calculate single statistics (although this may be
  made available in the future).

- This is an early release and should be considered unstable, at least until I'm
- confident in the stability & performance in a real world application setting
- (let me know in [the Welcome discussion
- thread](https://github.com/Martin-Nyaga/fast_statistics/discussions/1) if you
- use it!). I'm also not really an expert in C++, so reviews & suggestions are
- welcome.
-
  ## Contributing

  Bug reports and pull requests are welcome on GitHub at
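
Reviewer note on the README hunks above: the "hand-written ruby" baseline and the stated limitations (equal-length inner arrays, numbers only, no `nil` handling) may be easier to follow with a small pure-Ruby sketch. This is not the gem's implementation or API. `describe_variables` and `percentile` are hypothetical names, and the quartile rule used here (linear interpolation between the closest ranks) is an assumption, since the diff does not say which rule the gem uses.

```ruby
# Hypothetical helpers for illustration only; not part of fast_statistics.

# Value at the given fraction (0.0..1.0) of an already sorted array,
# using linear interpolation between the two closest ranks (assumed rule).
def percentile(sorted, fraction)
  rank = fraction * (sorted.length - 1)
  lower = sorted[rank.floor]
  upper = sorted[rank.ceil]
  lower + (upper - lower) * (rank - rank.floor)
end

# Computes descriptive statistics for each variable (inner array) of a 2D array.
# Mirrors the README's stated limitations by rejecting ragged or non-numeric input.
def describe_variables(data)
  if data.empty? || data.first.empty?
    raise ArgumentError, "data must be a non-empty 2D array"
  end

  length = data.first.length

  data.map do |variable|
    unless variable.length == length && variable.all? { |v| v.is_a?(Numeric) }
      raise ArgumentError, "all variables must have #{length} numeric data points"
    end

    sorted = variable.sort
    mean = variable.sum.to_f / length
    # Population variance: divide by n, not n - 1.
    variance = variable.sum { |v| (v - mean)**2 } / length

    {
      min: sorted.first,
      max: sorted.last,
      mean: mean,
      median: percentile(sorted, 0.5),
      q1: percentile(sorted, 0.25),
      q3: percentile(sorted, 0.75),
      standard_deviation: Math.sqrt(variance)
    }
  end
end

stats = describe_variables([[1, 2, 3, 4], [2.0, 2.0, 2.0, 2.0]])
puts stats.first[:mean]
```

Sorting each variable once and reading the minimum, maximum, median, and quartiles from the sorted copy keeps the sketch short; an input containing `nil` or a ragged row raises `ArgumentError` instead of yielding a partial result.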
data/Rakefile CHANGED
@@ -22,15 +22,3 @@ task :benchmark => [:clean, :compile] do
  bench.compare_results!
  bench.benchmark_ips!
  end
-
- task :profile => [:clean, :compile] do
- require "fast_statistics"
- $stdout.sync = true
-
- variables = 12
- length = 100_000
- data = (0..(variables - 1)).map { (0..(length - 1)).map { rand } }
- FastStatistics::Array2D.new(data, dtype: :float).mean.to_a
- puts
- puts
- end
@@ -8,6 +8,9 @@ if ENV["DEBUG"]
  $defs << "-DDEBUG"
  end

+ # Compile with C++11
+ $CXXFLAGS += " -std=c++11 "
+
  # Disable warnings
  [
  / -Wdeclaration-after-statement/,
@@ -1,3 +1,3 @@
  module FastStatistics
- VERSION = "0.2.0"
+ VERSION = "0.2.1"
  end
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: fast_statistics
  version: !ruby/object:Gem::Version
- version: 0.2.0
+ version: 0.2.1
  platform: ruby
  authors:
  - Martin Nyaga
  autorequire:
  bindir: bin
  cert_chain: []
- date: 2021-03-24 00:00:00.000000000 Z
+ date: 2021-07-01 00:00:00.000000000 Z
  dependencies: []
  description: Fast computation of descriptive statistics in ruby using native code
  and SIMD