bio-statsample-timeseries 0.2.0 → 0.2.1
- data/README.md +26 -3
- data/README.rdoc +23 -1
- data/VERSION +1 -1
- data/lib/bio-statsample-timeseries/arima.rb +59 -40
- data/lib/bio-statsample-timeseries/arima/kalman.rb +37 -26
- data/lib/bio-statsample-timeseries/arima/likelihood.rb +1 -4
- data/lib/bio-statsample-timeseries/timeseries.rb +38 -47
- data/lib/bio-statsample-timeseries/timeseries/pacf.rb +38 -33
- data/lib/bio-statsample-timeseries/utility.rb +24 -21
- metadata +3 -3
data/README.md
CHANGED
@@ -2,10 +2,23 @@
 
 [![Build Status](https://secure.travis-ci.org/AnkurGel/bioruby-statsample-timeseries.png)](http://travis-ci.org/ankurgel/bioruby-statsample-timeseries)
 
-
+Statsample-Timeseries is an extension to [Statsample](https://github.com/clbustos/statsample), a suite of advanced statistics in Ruby. It incorporates helpful timeseries functions, estimations and modules such as:
+
+* ACF
+* PACF
+* ARIMA
+* KalmanFilter
+* LogLikelihood
+* Autocovariances
+* Moving Averages
 
 Note: this software is under active development!
 
+
+## Dependency
+
+Please install [`rb-gsl`](http://rb-gsl.rubyforge.org/), a Ruby wrapper over the GNU Scientific Library. It enables us to use various minimization techniques during estimation.
+
 ## Installation
 
 ```sh
@@ -18,9 +31,19 @@ Note: this software is under active development!
 require 'bio-statsample-timeseries'
 ```
 
-The API doc is online. For more code examples see the test files in
+The [API doc](http://rubydoc.info/gems/bio-statsample-timeseries/0.2.0/frames) is online. For more code examples see the test files in
 the source tree.
-
+
+
+## Contributing
+
+* Fork the project.
+* Add/Modify code.
+* Write equivalent documentation and **tests**.
+* Run `rake test` to verify that all test cases pass.
+* Pull request. :)
+
+
 ## Project home page
 
 Information on the source tree, documentation, examples, issues and
data/README.rdoc
CHANGED
@@ -4,10 +4,23 @@
 src="https://secure.travis-ci.org/AnkurGel/bioruby-statsample-timeseries.png"
 />}[http://travis-ci.org/#!/AnkurGel/bioruby-statsample-timeseries]
 
-
+Statsample-Timeseries is an extension to [Statsample](https://github.com/clbustos/statsample), a suite of advanced statistics in Ruby. It incorporates helpful timeseries functions, estimations and modules such as:
+
+* ACF
+* PACF
+* ARIMA
+* KalmanFilter
+* LogLikelihood
+* Autocovariances
+* Moving Averages
 
 Note: this software is under active development!
 
+== Dependency
+
+Please install [`rb-gsl`](http://rb-gsl.rubyforge.org/), a Ruby wrapper over the GNU Scientific Library. It enables us to use various minimization techniques during estimation.
+
+
 == Installation
 
 gem install bio-statsample-timeseries
@@ -22,6 +35,15 @@ To use the library
 
 The API doc is online. For more code examples see also the test files in
 the source tree.
+
+
+== Contributing
+
+* Fork the project.
+* Add/Modify code.
+* Write equivalent documentation and **tests**.
+* Run `rake test` to verify that all test cases pass.
+* Pull request. :)
 
 == Project home page
 
data/VERSION
CHANGED
@@ -1 +1 @@
-0.2.0
+0.2.1
data/lib/bio-statsample-timeseries/arima.rb
CHANGED
@@ -13,22 +13,23 @@ module Statsample
 class ARIMA < Statsample::Vector
 include Statsample::TimeSeries
 
-
-
-
-
-
-
-
-
+#= Kalman filter on ARIMA model
+#== Params
+#
+#* *ts*: timeseries object
+#* *p*: AR order
+#* *i*: Integrated part order
+#* *q*: MA order
+#
+#== Usage
 # ts = (1..100).map { rand }.to_ts
-# k_obj = ARIMA.ks(ts, 2, 1, 1)
+# k_obj = Statsample::TimeSeries::ARIMA.ks(ts, 2, 1, 1)
 # k_obj.ar
 # #=> AR's phi coefficients
 # k_obj.ma
 # #=> MA's theta coefficients
 #
-
+#== Returns
 #Kalman filter object
 def self.ks(ts, p, i, q)
 #prototype
@@ -46,25 +47,41 @@ module Statsample
 #or Burg's algorithm(more efficient)
 end
 
-#Converts a linear array into a vector
+#Converts a linear array into a Statsample vector
+#== Parameters
+#
+#* *arr*: Array to be converted into a Statsample vector
 def create_vector(arr)
 Statsample::Vector.new(arr, :scale)
 end
 
-
+#=Yule Walker
+#Performs Yule-Walker estimation on the given timeseries, observations and order
+#==Parameters
+#
+#* *ts*: timeseries object
+#* *n*: number of observations
+#* *k*: order
+#
+#==Returns
+#phi and sigma vectors
 def yule_walker(ts, n, k)
-#parameters: timeseries, no of observations, order
-#returns: simulated autoregression with phi parameters and sigma
 phi, sigma = Pacf::Pacf.yule_walker(ts, k)
 return phi, sigma
 #return ar_sim(n, phi, sigma)
 end
 
+#=Levinson-Durbin estimation
+#Performs Levinson-Durbin estimation on the given timeseries, observations and order
+#==Parameters
+#
+#* *ts*: timeseries object
+#* *n*: number of observations
+#* *k*: autoregressive order
+#
+#==Returns
+#phi and sigma vectors
 def levinson_durbin(ts, n, k)
-#parameters;
-#ts: timseries against which to generate phi coefficients
-#n: number of observations for simulation
-#k: order of AR
 intermediate = Pacf::Pacf.levinson_durbin(ts, k)
 phi, sigma = intermediate[1], intermediate[0]
 return phi, sigma
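The Levinson-Durbin recursion documented above can be sketched in plain Ruby. This is a hypothetical standalone version that takes autocovariances directly; it is not the gem's `Pacf::Pacf.levinson_durbin`, whose argument handling differs:

```ruby
# Standalone Levinson-Durbin sketch: from autocovariances [c0, c1, ..., ck]
# estimate AR(k) coefficients and the innovation variance.
def levinson_durbin(acovf)
  order = acovf.size - 1
  phi = Array.new(order + 1) { Array.new(order + 1, 0.0) }
  sig = Array.new(order + 1, 0.0)
  phi[1][1] = acovf[1] / acovf[0]
  sig[1] = acovf[0] - phi[1][1] * acovf[1]
  (2..order).each do |k|
    # reflection coefficient for order k
    num = acovf[k] - (1...k).map { |j| phi[k - 1][j] * acovf[k - j] }.sum
    phi[k][k] = num / sig[k - 1]
    (1...k).each { |j| phi[k][j] = phi[k - 1][j] - phi[k][k] * phi[k - 1][k - j] }
    sig[k] = sig[k - 1] * (1 - phi[k][k]**2)
  end
  [sig[order], (1..order).map { |j| phi[order][j] }]
end
```

For example, an AR(1) process with phi = 0.5 and unit innovation variance has autocovariances c_k = 0.5^k / (1 - 0.25); feeding the first three into the recursion recovers phi and the unit variance.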
@@ -75,19 +92,20 @@ module Statsample
 #Simulates an autoregressive AR(p) model with specified number of
 #observations(n), with phi number of values for order p and sigma.
 #
-
+#==Analysis
+# [http://ankurgoel.com/blog/2013/07/20/ar-ma-arma-acf-pacf-visualizations/](http://ankurgoel.com/blog/2013/07/20/ar-ma-arma-acf-pacf-visualizations/)
 #
-
-
-
-
+#==Parameters
+#* *n*: integer, number of observations
+#* *phi*: array of phi values, e.g: [0.35, 0.213] for p = 2
+#* *sigma*: float, sigma value for error generalization
 #
-
+#==Usage
 # ar = ARIMA.new
 # ar.ar_sim(1500, [0.3, 0.9], 0.12)
 # # => AR(2) autoregressive series of 1500 values
 #
-
+#==Returns
 #Array of generated autoregressive series against attributes
 def ar_sim(n, phi, sigma)
 #using random number generator for inclusion of white noise
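A minimal plain-Ruby sketch of what `ar_sim` describes. The Box-Muller transform for the Gaussian white noise and the burn-in period are both assumptions for illustration; the gem's implementation details may differ:

```ruby
# AR(p) simulation sketch: x[t] = phi1*x[t-1] + ... + phip*x[t-p] + e[t]
def ar_sim(n, phi, sigma, burn_in = 200, rng = Random.new(42))
  # Box-Muller: turn two uniforms into one standard normal draw
  gauss = -> { Math.sqrt(-2.0 * Math.log(1.0 - rng.rand)) * Math.cos(2 * Math::PI * rng.rand) }
  series = Array.new(phi.size, 0.0)
  (n + burn_in).times do
    x = sigma * gauss.call
    phi.each_with_index { |p, i| x += p * series[-1 - i] }
    series << x
  end
  series.last(n)
end
```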
@@ -122,16 +140,16 @@ module Statsample
 #Simulates a moving average model with specified number of
 #observations(n), with theta values for order k and sigma
 #
-
-
-
-
+#==Parameters
+#* *n*: integer, number of observations
+#* *theta*: array of floats, e.g: [0.23, 0.732]; each value must be < 1
+#* *sigma*: float, sigma value for whitenoise error
 #
-
+#==Usage
 # ar = ARIMA.new
 # ar.ma_sim(1500, [0.23, 0.732], 0.27)
 #
-
+#==Returns
 #Array of generated MA(q) model
 def ma_sim(n, theta, sigma)
 #n is number of observations (eg: 1000)
@@ -158,7 +176,7 @@ module Statsample
 x
 end
 
-
+#=ARMA(Autoregressive and Moving Average) Simulator
 #ARMA is represented by:
 #http://upload.wikimedia.org/math/2/e/d/2ed0485927b4370ae288f1bc1fe2fc8b.png
 #This simulates the ARMA model against p, q and sigma.
@@ -166,19 +184,20 @@ module Statsample
 #If q = 0, then model is pure AR(p),
 #otherwise, model is ARMA(p, q) represented by above.
 #
-
+#==Detailed analysis
+# [http://ankurgoel.com/blog/2013/07/20/ar-ma-arma-acf-pacf-visualizations/](http://ankurgoel.com/blog/2013/07/20/ar-ma-arma-acf-pacf-visualizations/)
 #
-
-
-
-
-
+#==Parameters
+#* *n*: integer, number of observations
+#* *p*: array, contains p number of phi values for AR(p) process
+#* *q*: array, contains q number of theta values for MA(q) process
+#* *sigma*: float, sigma value for whitenoise error generation
 #
-
+#==Usage
 # ar = ARIMA.new
 # ar.arma_sim(1500, [0.3, 0.272], [0.8, 0.317], 0.92)
 #
-
+#==Returns
 #array of generated ARMA model values
 def arma_sim(n, p, q, sigma)
 #represented by :
data/lib/bio-statsample-timeseries/arima/kalman.rb
CHANGED
@@ -6,8 +6,22 @@ module Statsample
 class KalmanFilter
 include Statsample::TimeSeries
 include GSL::MultiMin
-
-
+
+#timeseries object
+attr_accessor :ts
+#Autoregressive order
+attr_accessor :p
+#Integrated part order
+attr_accessor :i
+#Moving average order
+attr_accessor :q
+
+# Autoregressive coefficients
+attr_reader :ar
+# Moving average coefficients
+attr_reader :ma
+
+#Creates a new KalmanFilter object and computes the likelihood
 def initialize(ts=[].to_ts, p=0, i=0, q=0)
 @ts = ts
 @p = p
@@ -21,17 +35,14 @@ module Statsample
 @p, @i, @q, @ts.size, @ts.to_a.join(','))
 end
 
-
-#Function which minimizes KalmanFilter.ll iteratively for initial parameters
-
-
-
-
-
-
-#- KalmanFilter.ks(ts, 3, 1)
-#NOTE: Suceptible to syntactical change later. Can be called directly on timeseries.
-#NOTE: Return parameters
+# = Kalman Filter
+# Function which minimizes KalmanFilter.ll iteratively for initial parameters
+# == Usage
+# @s = [-1.16025577,0.64758021,0.77158601,0.14989543,2.31358162,3.49213868,1.14826956,0.58169457,-0.30813868,-0.34741084,-1.41175595,0.06040081, -0.78230232,0.86734837,0.95015787,-0.49781397,0.53247330,1.56495187,0.30936619,0.09750217,1.09698829,-0.81315490,-0.79425607,-0.64568547,-1.06460320,1.24647894,0.66695937,1.50284551,1.17631218,1.64082872,1.61462736,0.06443761,-0.17583741,0.83918339,0.46610988,-0.54915270,-0.56417108,-1.27696654,0.89460084,1.49970338,0.24520493,0.26249138,-1.33744834,-0.57725961,1.55819543,1.62143157,0.44421891,-0.74000084 ,0.57866347,3.51189333,2.39135077,1.73046244,1.81783890,0.21454040,0.43520890,-1.42443856,-2.72124685,-2.51313877,-1.20243091,-1.44268002 ,-0.16777305,0.05780661,2.03533992,0.39187242,0.54987983,0.57865693,-0.96592469,-0.93278473,-0.75962671,-0.63216906,1.06776183, 0.17476059 ,0.06635860,0.94906227,2.44498583,-1.04990407,-0.88440073,-1.99838258,-1.12955558,-0.62654882,-1.36589161,-2.67456821,-0.97187696, -0.84431782 ,-0.10051809,0.54239549,1.34622861,1.25598105,0.19707759,3.29286114,3.52423499,1.69146333,-0.10150024,0.45222903,-0.01730516, -0.49828727, -1.18484684,-1.09531773,-1.17190808,0.30207662].to_ts
+# @kf = Statsample::TimeSeries::ARIMA.ks(@s, 1, 0, 0)
+# #=> ks is implicitly called in the above operation
+# @kf.ar
+# #=> AR coefficients
 def ks
 initial = Array.new((@p+@q), 0.0)
 
@@ -80,16 +91,16 @@ module Statsample
 end
 
 
-#=
+#=Log Likelihood
 #Computes Log likelihood on given parameters, ARMA order and timeseries
-
-
-
-
-
-
+#==Params
+#* *params*: array of floats, contains phi/theta parameters
+#* *timeseries*: timeseries object
+#* *p*: integer, AR(p) order
+#* *q*: integer, MA(q) order
+#==Returns
 #LogLikelihood object
-
+#==Usage
 # s = (1..100).map { rand }.to_ts
 # p, q = 1, 0
 # ll = KalmanFilter.log_likelihood([0.2], s, p, q)
@@ -104,12 +115,12 @@ module Statsample
 #=T
 #The coefficient matrix for the state vector in state equation
 # It's dimensions is r+k x r+k
-
-
-
-
+#==Parameters
+#* *r*: integer, r is max(p, q+1), where p and q are orders of AR and MA respectively
+#* *k*: integer, number of exogenous variables in ARMA model
+#* *q*: integer, the AR coefficient of ARMA model
 
-
+#==References: Statsmodels tsa, Durbin and Koopman Section 4.7
 #def self.T(r, k, p)
 # arr = Matrix.zero(r)
 # params_padded = Statsample::Vector.new(Array.new(r, 0), :scale)
data/lib/bio-statsample-timeseries/arima/likelihood.rb
CHANGED
@@ -4,15 +4,12 @@ module Statsample
 module KF
 class LogLikelihood
 
-#log_likelihood
 #Gives log likelihood value of an ARMA(p, q) process on given parameters
 attr_reader :log_likelihood
 
-#sigma
 #Gives sigma value of an ARMA(p,q) process on given parameters
 attr_reader :sigma
 
-#aic
 #Gives AIC(Akaike Information Criterion)
 #https://www.scss.tcd.ie/Rozenn.Dahyot/ST7005/13AICBIC.pdf
 attr_reader :aic
@@ -25,7 +22,7 @@ module Statsample
 ll
 end
 
-
+#===Log likelihood link function.
 #iteratively minimized by simplex algorithm via KalmanFilter.ks
 #Not meant to be used directly. Will make it private later.
 def ll
data/lib/bio-statsample-timeseries/timeseries.rb
CHANGED
@@ -49,20 +49,15 @@ module Statsample
 
 #=Partial Autocorrelation
 #Generates partial autocorrelation series for a timeseries
-
-
-
-# *
-# *
-# *
-
-#
+#==Parameters
+#* *max_lags*: integer, optional - provide number of lags
+#* *method*: string. Default: 'yw'.
+# * *yw*: for the unbiased Yule-Walker approach
+# * *mle*: for the maximum likelihood approach
+# * *ld*: for the Levinson-Durbin recursive approach
+#==Returns
+# array of pacf
 def pacf(max_lags = nil, method = :yw)
-#parameters:
-#max_lags => maximum number of lags for pcf
-#method => for autocovariance in yule_walker:
-#'yw' for 'yule-walker unbaised', 'mle' for biased maximum likelihood
-#'ld' for Levinson-Durbin recursion
 
 method = method.downcase.to_sym
 max_lags ||= (10 * Math.log10(size)).to_i
@@ -78,12 +73,11 @@ module Statsample
 
 #=Autoregressive estimation
 #Generates AR(k) series for the calling timeseries by yule walker.
-
-
-
-
+#==Parameters
+#* *n*: integer, (default = 1500) number of observations for AR.
+#* *k*: integer, (default = 1) order of AR process.
+#==Returns
 #Array constituting estimated AR series.
-#
 def ar(n = 1500, k = 1)
 series = Statsample::TimeSeries.arima
 #series = Statsample::TimeSeries::ARIMA.new
@@ -92,11 +86,11 @@ module Statsample
 
 #=AutoCovariance
 #Provides autocovariance of timeseries.
-
-
-
-
-#
+#==Parameters
+#* *demean* = true; optional. Supply false if series is not to be demeaned
+#* *unbiased* = true; optional. true/false for unbiased/biased form of autocovariance
+#==Returns
+# Autocovariance value
 def acvf(demean = true, unbiased = true)
 #TODO: change parameters list in opts.merge as suggested by John
 #functionality: computes autocovariance of timeseries data
@@ -123,7 +117,6 @@ module Statsample
 
 #=Correlation
 #Gives correlation of timeseries.
-#
 def correlate(a, v, mode = 'full')
 #peforms cross-correlation of two series
 #multiarray.correlate2(a, v, 'full')
@@ -173,15 +166,16 @@ module Statsample
 # But, second difference of series is NOT X(t) - X(t-2)
 # It is the first difference of the first difference
 # => (X(t) - X(t-1)) - (X(t-1) - X(t-2))
-
-
-
+#==Params
+#* *max_lags*: integer, (default: 1), number of differences required
+#==Usage
 #
 # ts = (1..10).map { rand }.to_ts
 # # => [0.69, 0.23, 0.44, 0.71, ...]
 #
 # ts.diff # => [nil, -0.46, 0.21, 0.27, ...]
-
+#==Returns
+# Timeseries object
 def diff(max_lags = 1)
 ts = self
 difference = []
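The differencing rule documented above (the second difference is the first difference of the first difference) can be sketched over a plain array; the nil padding at the head mirrors the documented output. A hedged sketch, not the gem's `diff`:

```ruby
# Iterated differencing sketch: each pass maps [x0, x1, x2, ...]
# to [nil, x1 - x0, x2 - x1, ...]
def diff(series, max_lags = 1)
  max_lags.times do
    series = [nil] + series.each_cons(2).map { |a, b| a && b ? b - a : nil }
  end
  series
end
```

With `[1, 4, 9, 16]`, one pass yields `[nil, 3, 5, 7]` and a second pass yields `[nil, nil, 2, 2]`, i.e. the first difference of the first difference.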
@@ -195,10 +189,10 @@ module Statsample
 #=Moving Average
 # Calculates the moving average of the series using the provided
 # lookback argument. The lookback defaults to 10 periods.
-
-
+#==Parameters
+#* *n*: integer, (default = 10) - lookback argument
 #
-
+#==Usage
 #
 # ts = (1..100).map { rand }.to_ts
 # # => [0.69, 0.23, 0.44, 0.71, ...]
@@ -206,7 +200,7 @@ module Statsample
 # # first 9 observations are nil
 # ts.ma # => [ ... nil, 0.484... , 0.445... , 0.513 ... , ... ]
 #
-
+#==Returns
 #Resulting moving average timeseries object
 def ma(n = 10)
 return mean if n >= size
@@ -226,20 +220,18 @@ module Statsample
 # use a lot more than n observations to calculate. The series is stable
 # if the size of the series is >= 3.45 * (n + 1)
 #
-
-
-
-#if false, uses 2/(n+1) value
-#
-#*Usage*:
+#==Parameters
+#* *n*: integer, (default = 10)
+#* *wilder*: boolean, (default = false); if true, a 1/n smoothing value is used, otherwise 2/(n+1)
 #
+#==Usage
 # ts = (1..100).map { rand }.to_ts
 # # => [0.69, 0.23, 0.44, 0.71, ...]
 #
 # # first 9 observations are nil
 # ts.ema # => [ ... nil, 0.509... , 0.433..., ... ]
 #
-
+#==Returns
 #EMA timeseries
 def ema(n = 10, wilder = false)
 smoother = wilder ? 1.0 / n : 2.0 / (n + 1)
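A sketch of the smoothing rule described above. Seeding the EMA with the simple average of the first n values is an assumption made here for illustration; the gem may seed differently:

```ruby
# EMA sketch: first n-1 slots are nil, slot n is the simple average of the
# first n values, then each later value is blended in with the smoothing
# constant (1/n when wilder, else 2/(n+1)).
def ema(series, n = 10, wilder = false)
  smoother = wilder ? 1.0 / n : 2.0 / (n + 1)
  prev = series.first(n).sum / n.to_f
  out = Array.new(n - 1, nil) << prev
  series.drop(n).each do |x|
    prev = smoother * x + (1 - smoother) * prev
    out << prev
  end
  out
end
```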
@@ -264,19 +256,18 @@ module Statsample
 # Calculates the MACD (moving average convergence-divergence) of the time
 # series - this is a comparison of a fast EMA with a slow EMA.
 #
-
-
-
-
+#==Parameters
+#* *fast*: integer, (default = 12) - fast component of MACD
+#* *slow*: integer, (default = 26) - slow component of MACD
+#* *signal*: integer, (default = 9) - signal component of MACD
 #
-
+#==Usage
 # ts = (1..100).map { rand }.to_ts
 # # => [0.69, 0.23, 0.44, 0.71, ...]
 # ts.macd(13)
 #
-
-# Array of two timeseries - comparison of fast EMA with slow
-# and EMA with signal value
+#==Returns
+# Array of two timeseries - comparison of fast EMA with slow and EMA with signal value
 def macd(fast = 12, slow = 26, signal = 9)
 series = ema(fast) - ema(slow)
 [series, series.ema(signal)]
data/lib/bio-statsample-timeseries/timeseries/pacf.rb
CHANGED
@@ -15,16 +15,16 @@ module Statsample
 
 
 #=Levinson-Durbin Algorithm
-
-
-
-
-
-
-
-
-
-
+#==Parameters
+#* *series*: timeseries, or a series of autocovariances
+#* *nlags*: integer (default: 10): largest lag to include in recursion, i.e. the order of the AR process
+#* *is_acovf*: boolean (default: false): if false, series is a timeseries; otherwise it contains autocovariances
+#
+#==Returns
+#* *sigma_v*: estimate of the error variance
+#* *arcoefs*: AR coefficients
+#* *pacf*: pacf function
+#* *sigma*: some function
 def self.levinson_durbin(series, nlags = 10, is_acovf = false)
 
 if is_acovf
@@ -60,26 +60,25 @@ module Statsample
 return [sigma_v, arcoefs, pacf, sig, phi]
 end
 
+#Returns diagonal elements of matrices
+# Will later abstract it to utilities
 def self.diag(mat)
-#returns array of diagonal elements of a matrix.
-#will later abstract it to matrix.rb in Statsample
 return mat.each_with_index(:diagonal).map { |x, r, c| x }
 end
 
 
 #=Yule Walker Algorithm
-#From the series, estimates AR(p)(autoregressive) parameter
-#using Yule-Waler equation. See -
+#From the series, estimates AR(p) (autoregressive) parameters using the Yule-Walker equations. See -
 #http://en.wikipedia.org/wiki/Autoregressive_moving_average_model
-
-
-
-
-
-
-
-
-
+#
+#==Parameters
+#* *ts*: timeseries
+#* *k*: order, default = 1
+#* *method*: can be 'yw' or 'mle'. If 'yw' then it is unbiased, denominator is (n - k)
+#
+#==Returns
+#* *rho*: autoregressive coefficients
+#* *sigma*: sigma parameter
 def self.yule_walker(ts, k = 1, method='yw')
 ts = ts - ts.mean
 n = ts.size
@@ -110,17 +109,21 @@ module Statsample
 return [phi, sigma]
 end
 
+#=Toeplitz
+# Generates a Toeplitz matrix from an array
+#http://en.wikipedia.org/wiki/Toeplitz_matrix
+#A Toeplitz matrix is equal when stored in row and column major order
+#==Parameters
+#* *arr*: array of integers
+#==Usage
+# arr = [0,1,2,3]
+# Pacf.toeplitz(arr)
+#==Returns
+# [[0, 1, 2, 3],
+# [1, 0, 1, 2],
+# [2, 1, 0, 1],
+# [3, 2, 1, 0]]
 def self.toeplitz(arr)
-#Generates Toeplitz matrix -
-#http://en.wikipedia.org/wiki/Toeplitz_matrix
-#Toeplitz matrix are equal when they are stored in row &
-#column major
-#=> arr = [0, 1, 2, 3]
-#=> result:
-#[[0, 1, 2, 3],
-# [1, 0, 1, 2],
-# [2, 1, 0, 1],
-# [3, 2, 1, 0]]
 eplitz_matrix = Array.new(arr.size) { Array.new(arr.size) }
 
 0.upto(arr.size - 1) do |i|
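The documented return value can be reproduced with a one-liner over absolute index differences. This is an alternative construction shown for illustration, not the gem's loop:

```ruby
# Symmetric Toeplitz sketch: entry (i, j) is arr[|i - j|]
def toeplitz(arr)
  Array.new(arr.size) { |i| Array.new(arr.size) { |j| arr[(i - j).abs] } }
end
```

`toeplitz([0, 1, 2, 3])` produces exactly the 4x4 matrix shown in the documentation above.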
@@ -140,6 +143,8 @@ module Statsample
 eplitz_matrix
 end
 
+#===Solves matrix equations
+#Solves for X in AX = B
 def self.solve_matrix(matrix, out_vector)
 solution_vector = Array.new(out_vector.size, 0)
 matrix = matrix.to_a
data/lib/bio-statsample-timeseries/utility.rb
CHANGED
@@ -5,36 +5,35 @@ module Statsample
 include Summarizable
 
 #=Squares of sum
-
-
-
+#==Parameter
+#* *demean*: boolean - optional. __default__: false
+#==Returns
 #Sums the timeseries and then returns the square
 def squares_of_sum(demean = false)
 if demean
 m = self.mean
-self.map { |x| (x-m) }.sum
+self.map { |x| (x-m) }.sum**2
 else
-return self.sum.to_f
+return self.sum.to_f**2
 end
 end
 end
 
 
 class ::Matrix
-
-#---
+#==Squares of sum
 #Does squares of sum in column order.
 #Necessary for computations in various processes
 def squares_of_sum
 (0...column_size).map do |j|
-self.column(j).sum
+self.column(j).sum**2
 end
 end
 
-
-#---
-#returns bool
+#==Symmetric?
 #`symmetric?` is present in Ruby Matrix 1.9.3+, but not in 1.8.*
+#===Returns
+# bool
 def symmetric?
 return false unless square?
 
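The fix in this hunk appends `**2` so the series is summed first and then the sum is squared. A standalone sketch of the corrected semantics over a plain array (illustrative, not the gem's monkey patch):

```ruby
# Sum the values (optionally demeaned), then square the result.
def squares_of_sum(series, demean = false)
  if demean
    m = series.sum.to_f / series.size
    series.map { |x| x - m }.sum**2
  else
    series.sum.to_f**2
  end
end
```

For `[1, 2, 3]` this yields `6.0**2 = 36.0`, while the demeaned variant sums to zero and therefore squares to zero.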
@@ -46,15 +45,16 @@ module Statsample
 true
 end
 
-
+#==Cholesky decomposition
 #Reference: http://en.wikipedia.org/wiki/Cholesky_decomposition
-
-#==Description
+#===Description
 #Cholesky decomposition is represented by `M = L X L*`, where
 #M is the symmetric matrix and `L` is the lower half of cholesky matrix,
 #and `L*` is the conjugate form of `L`.
-
-
+#===Returns
+# Cholesky decomposition for a given matrix (if symmetric)
+#===Utility
+# Essential matrix function, requisite in kalman filter, least squares
 def cholesky
 raise ArgumentError, "Given matrix should be symmetric" unless symmetric?
 c = Matrix.zero(row_size)
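The decomposition described above (`M = L X L*`) can be sketched for a real symmetric positive-definite matrix given as nested arrays. This follows the standard algorithm and is not necessarily the gem's `Matrix#cholesky`:

```ruby
# Cholesky sketch: returns lower-triangular l with m == l * l^T
def cholesky(m)
  n = m.size
  l = Array.new(n) { Array.new(n, 0.0) }
  n.times do |i|
    (0..i).each do |j|
      # accumulated products of the already-computed entries
      s = (0...j).map { |k| l[i][k] * l[j][k] }.sum
      l[i][j] = i == j ? Math.sqrt(m[i][i] - s) : (m[i][j] - s) / l[j][j]
    end
  end
  l
end
```

For `[[4, 2], [2, 3]]` the factor is `[[2, 0], [1, sqrt(2)]]`, and multiplying it by its transpose recovers the input.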
@@ -74,15 +74,16 @@ module Statsample
 c
 end
 
-
+#==Chain Product
 #Class method
 #Returns the chain product of two matrices
-
+#===Usage
 #Let `a` be 4 * 3 matrix,
 #Let `b` be 3 * 3 matrix,
 #Let `c` be 3 * 1 matrix,
 #then `Matrix.chain_dot(a, b, c)`
-
+#===NOTE
+# Send the matrices in multiplicative order with proper dimensions
 def self.chain_dot(*args)
 #inspired by Statsmodels
 begin
@@ -93,7 +94,7 @@ module Statsample
 end
 
 
-
+#==Adds a column of constants.
 #Appends a column of ones to the matrix/array if first argument is false
 #If an n-array, first checks if one column of ones is already present
 #if present, then original(self) is returned, else, prepends with a vector of ones
@@ -115,6 +116,7 @@ module Statsample
 return Matrix.rows(vectors)
 end
 
+#populates column i of given matrix with arr
 def set_column(i, arr)
 columns = self.column_vectors
 column = columns[i].to_a
@@ -122,7 +124,8 @@ module Statsample
 columns[i] = column
 return Matrix.columns(columns)
 end
-
+
+#populates row i of given matrix with arr
 def set_row(i, arr)
 #similar implementation as set_column
 #writing and commenting metaprogrammed version
metadata
CHANGED
@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: bio-statsample-timeseries
 version: !ruby/object:Gem::Version
-version: 0.2.0
+version: 0.2.1
 prerelease:
 platform: ruby
 authors:
@@ -10,7 +10,7 @@ authors:
 autorequire:
 bindir: bin
 cert_chain: []
-date: 2013-09-
+date: 2013-09-23 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
 name: statsample
@@ -260,7 +260,7 @@ required_ruby_version: !ruby/object:Gem::Requirement
 version: '0'
 segments:
 - 0
-hash:
+hash: -258712685
 required_rubygems_version: !ruby/object:Gem::Requirement
 none: false
 requirements: