bio-statsample-timeseries 0.2.0 → 0.2.1
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- data/README.md +26 -3
- data/README.rdoc +23 -1
- data/VERSION +1 -1
- data/lib/bio-statsample-timeseries/arima.rb +59 -40
- data/lib/bio-statsample-timeseries/arima/kalman.rb +37 -26
- data/lib/bio-statsample-timeseries/arima/likelihood.rb +1 -4
- data/lib/bio-statsample-timeseries/timeseries.rb +38 -47
- data/lib/bio-statsample-timeseries/timeseries/pacf.rb +38 -33
- data/lib/bio-statsample-timeseries/utility.rb +24 -21
- metadata +3 -3
data/README.md
CHANGED
@@ -2,10 +2,23 @@
 
 [](http://travis-ci.org/ankurgel/bioruby-statsample-timeseries)
 
-
+Statsample-Timeseries is an extension to [Statsample](https://github.com/clbustos/statsample), a suite of advance statistics in Ruby. It incorporates helpful timeseries functions, estimations and modules such as:
+
+* ACF
+* PACF
+* ARIMA
+* KalmanFilter
+* LogLikelihood
+* Autocovariances
+* Moving Averages
 
 Note: this software is under active development!
 
+
+## Dependency
+
+Please install [`rb-gsl`](http://rb-gsl.rubyforge.org/) which is a Ruby wrapper over GNU Scientific Library. It enables us to use various minimization techniques during estimations.
+
 ## Installation
 
 ```sh
@@ -18,9 +31,19 @@ Note: this software is under active development!
 require 'bio-statsample-timeseries'
 ```
 
-The API doc is online. For more code examples see the test files in
+The [API doc](http://rubydoc.info/gems/bio-statsample-timeseries/0.2.0/frames) is online. For more code examples see the test files in
 the source tree.
-
+
+
+## Contributing
+
+* Fork the project.
+* Add/Modify code.
+* Write equivalent documentation and **tests**
+* Run `rake test` to verify that all test case passes
+* Pull request. :)
+
+
 ## Project home page
 
 Information on the source tree, documentation, examples, issues and
data/README.rdoc
CHANGED
@@ -4,10 +4,23 @@
 src="https://secure.travis-ci.org/AnkurGel/bioruby-statsample-timeseries.png"
 />}[http://travis-ci.org/#!/AnkurGel/bioruby-statsample-timeseries]
 
-
+Statsample-Timeseries is an extension to [Statsample](https://github.com/clbustos/statsample), a suite of advance statistics in Ruby. It incorporates helpful timeseries functions, estimations and modules such as:
+
+* ACF
+* PACF
+* ARIMA
+* KalmanFilter
+* LogLikelihood
+* Autocovariances
+* Moving Averages
 
 Note: this software is under active development!
 
+== Dependency
+
+Please install [`rb-gsl`](http://rb-gsl.rubyforge.org/) which is a Ruby wrapper over GNU Scientific Library. It enables us to use various minimization techniques during estimations.
+
+
 == Installation
 
 gem install bio-statsample-timeseries
@@ -22,6 +35,15 @@ To use the library
 
 The API doc is online. For more code examples see also the test files in
 the source tree.
+
+
+== Contributing
+
+* Fork the project.
+* Add/Modify code.
+* Write equivalent documentation and **tests**
+* Run `rake test` to verify that all test case passes
+* Pull request. :)
 
 == Project home page
 
data/VERSION
CHANGED
@@ -1 +1 @@
-0.2.0
+0.2.1
data/lib/bio-statsample-timeseries/arima.rb
CHANGED
@@ -13,22 +13,23 @@ module Statsample
 class ARIMA < Statsample::Vector
 include Statsample::TimeSeries
 
-
-
-
-
-
-
-
-
+#= Kalman filter on ARIMA model
+#== Params
+#
+#* *ts*: timeseries object
+#* *p*: AR order
+#* *i*: Integerated part order
+#* *q*: MA order
+#
+#== Usage
 # ts = (1..100).map { rand }.to_ts
-# k_obj = ARIMA.ks(ts, 2, 1, 1)
+# k_obj = Statsample::TimeSeries::ARIMA.ks(ts, 2, 1, 1)
 # k_obj.ar
 # #=> AR's phi coefficients
 # k_obj.ma
 # #=> MA's theta coefficients
 #
-
+#== Returns
 #Kalman filter object
 def self.ks(ts, p, i, q)
 #prototype
@@ -46,25 +47,41 @@ module Statsample
 #or Burg's algorithm(more efficient)
 end
 
-#Converts a linear array into a vector
+#Converts a linear array into a Statsample vector
+#== Parameters
+#
+#* *arr*: Array which has to be converted in Statsample vector
 def create_vector(arr)
 Statsample::Vector.new(arr, :scale)
 end
 
-
+#=Yule Walker
+#Performs yule walker estimation on given timeseries, observations and order
+#==Parameters
+#
+#* *ts*: timeseries object
+#* *n* : number of observations
+#* *k* : order
+#
+#==Returns
+#phi and sigma vectors
 def yule_walker(ts, n, k)
-#parameters: timeseries, no of observations, order
-#returns: simulated autoregression with phi parameters and sigma
 phi, sigma = Pacf::Pacf.yule_walker(ts, k)
 return phi, sigma
 #return ar_sim(n, phi, sigma)
 end
 
+#=Levinson Durbin estimation
+#Performs levinson durbin estimation on given timeseries, observations and order
+#==Parameters
+#
+#* *ts*: timeseries object
+#* *n* : number of observations
+#* *k* : autoregressive order
+#
+#==Returns
+#phi and sigma vectors
 def levinson_durbin(ts, n, k)
-#parameters;
-#ts: timseries against which to generate phi coefficients
-#n: number of observations for simulation
-#k: order of AR
 intermediate = Pacf::Pacf.levinson_durbin(ts, k)
 phi, sigma = intermediate[1], intermediate[0]
 return phi, sigma
@@ -75,19 +92,20 @@ module Statsample
 #Simulates an autoregressive AR(p) model with specified number of
 #observations(n), with phi number of values for order p and sigma.
 #
-
+#==Analysis:
+# [http://ankurgoel.com/blog/2013/07/20/ar-ma-arma-acf-pacf-visualizations/](http://ankurgoel.com/blog/2013/07/20/ar-ma-arma-acf-pacf-visualizations/)
 #
-
-
-
-
+#==Parameters:
+#* *n*: integer, number of observations
+#* *phi* :array of phi values, e.g: [0.35, 0.213] for p = 2
+#* *sigma*: float, sigma value for error generalization
 #
-
+#==Usage
 # ar = ARIMA.new
 # ar.ar_sim(1500, [0.3, 0.9], 0.12)
 # # => AR(2) autoregressive series of 1500 values
 #
-
+#==Returns
 #Array of generated autoregressive series against attributes
 def ar_sim(n, phi, sigma)
 #using random number generator for inclusion of white noise
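The `ar_sim` documentation above describes the standard AR(p) recurrence driven by white noise. As a rough illustration, here is a minimal plain-Ruby sketch of that recurrence. It is not the gem's implementation: the burn-in length, the crude uniform noise generator, and the stationary coefficients `[0.5, 0.2]` are assumptions of this sketch.

```ruby
# AR(p) recurrence: x[t] = phi[0]*x[t-1] + ... + phi[p-1]*x[t-p] + noise.
# Illustrative sketch only; ARIMA#ar_sim in the gem handles this internally.
def ar_sim_sketch(n, phi, sigma, burn_in = 300)
  rng = Random.new(42)
  x = Array.new(phi.size, 0.0)                      # zero start-up values
  (n + burn_in).times do
    noise = sigma * (2 * rng.rand - 1)              # crude white-noise term
    x << phi.each_with_index.sum { |c, i| c * x[-1 - i] } + noise
  end
  x.last(n)                                         # drop the burn-in
end

series = ar_sim_sketch(1500, [0.5, 0.2], 0.12)
```

Note the coefficients were chosen to keep the recurrence stationary; the doc's `[0.3, 0.9]` sums above one and would diverge in this naive sketch.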
@@ -122,16 +140,16 @@ module Statsample
 #Simulates a moving average model with specified number of
 #observations(n), with theta values for order k and sigma
 #
-
-
-
-
+#==Parameters
+#* *n*: integer, number of observations
+#* *theta*: array of floats, e.g: [0.23, 0.732], must be < 1
+#* *sigma*: float, sigma value for whitenoise error
 #
-
+#==Usage
 # ar = ARIMA.new
 # ar.ma_sim(1500, [0.23, 0.732], 0.27)
 #
-
+#==Returns
 #Array of generated MA(q) model
 def ma_sim(n, theta, sigma)
 #n is number of observations (eg: 1000)
@@ -158,7 +176,7 @@ module Statsample
 x
 end
 
-
+#=ARMA(Autoregressive and Moving Average) Simulator
 #ARMA is represented by:
 #http://upload.wikimedia.org/math/2/e/d/2ed0485927b4370ae288f1bc1fe2fc8b.png
 #This simulates the ARMA model against p, q and sigma.
@@ -166,19 +184,20 @@ module Statsample
 #If q = 0, then model is pure AR(p),
 #otherwise, model is ARMA(p, q) represented by above.
 #
-
+#==Detailed analysis:
+# [http://ankurgoel.com/blog/2013/07/20/ar-ma-arma-acf-pacf-visualizations/](http://ankurgoel.com/blog/2013/07/20/ar-ma-arma-acf-pacf-visualizations/)
 #
-
-
-
-
-
+#==Parameters
+#* *n*: integer, number of observations
+#* *p*: array, contains p number of phi values for AR(p) process
+#* *q*: array, contains q number of theta values for MA(q) process
+#* *sigma*: float, sigma value for whitenoise error generation
 #
-
+#==Usage
 # ar = ARIMA.new
 # ar.arma_sim(1500, [0.3, 0.272], [0.8, 0.317], 0.92)
 #
-
+#==Returns
 #array of generated ARMA model values
 def arma_sim(n, p, q, sigma)
 #represented by :
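The ARMA model documented above combines the AR recurrence with lagged white-noise (MA) terms. A plain-Ruby sketch of that combination, reusing the phi/theta values from the Usage example (the burn-in and noise generator are assumptions of this sketch, not the gem's code):

```ruby
# ARMA(p, q) sketch: x[t] = sum(phi_i * x[t-i]) + e[t] + sum(theta_j * e[t-j]).
# Illustrative only; the gem's arma_sim is the supported entry point.
def arma_sim_sketch(n, phi, theta, sigma, burn_in = 300)
  rng = Random.new(7)
  total = n + burn_in
  e = Array.new(total + theta.size) { sigma * (2 * rng.rand - 1) }
  x = Array.new(phi.size, 0.0)
  total.times do |t|
    i = t + theta.size                               # offset into noise history
    ar = phi.each_with_index.sum { |c, j| c * x[-1 - j] }
    ma = theta.each_with_index.sum { |c, j| c * e[i - 1 - j] }
    x << ar + e[i] + ma
  end
  x.last(n)
end

series = arma_sim_sketch(1500, [0.3, 0.272], [0.8, 0.317], 0.92)
```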
data/lib/bio-statsample-timeseries/arima/kalman.rb
CHANGED
@@ -6,8 +6,22 @@ module Statsample
 class KalmanFilter
 include Statsample::TimeSeries
 include GSL::MultiMin
-
-
+
+#timeseries object
+attr_accessor :ts
+#Autoregressive order
+attr_accessor :p
+#Integerated part order
+attr_accessor :i
+#Moving average order
+attr_accessor :q
+
+# Autoregressive coefficients
+attr_reader :ar
+# Moving average coefficients
+attr_reader :ma
+
+#Creates a new KalmanFilter object and computes the likelihood
 def initialize(ts=[].to_ts, p=0, i=0, q=0)
 @ts = ts
 @p = p
@@ -21,17 +35,14 @@ module Statsample
 @p, @i, @q, @ts.size, @ts.to_a.join(','))
 end
 
-
-#Function which minimizes KalmanFilter.ll iteratively for initial parameters
-
-
-
-
-
-
-#- KalmanFilter.ks(ts, 3, 1)
-#NOTE: Suceptible to syntactical change later. Can be called directly on timeseries.
-#NOTE: Return parameters
+# = Kalman Filter
+# Function which minimizes KalmanFilter.ll iteratively for initial parameters
+# == Usage
+# @s = [-1.16025577,0.64758021,0.77158601,0.14989543,2.31358162,3.49213868,1.14826956,0.58169457,-0.30813868,-0.34741084,-1.41175595,0.06040081, -0.78230232,0.86734837,0.95015787,-0.49781397,0.53247330,1.56495187,0.30936619,0.09750217,1.09698829,-0.81315490,-0.79425607,-0.64568547,-1.06460320,1.24647894,0.66695937,1.50284551,1.17631218,1.64082872,1.61462736,0.06443761,-0.17583741,0.83918339,0.46610988,-0.54915270,-0.56417108,-1.27696654,0.89460084,1.49970338,0.24520493,0.26249138,-1.33744834,-0.57725961,1.55819543,1.62143157,0.44421891,-0.74000084 ,0.57866347,3.51189333,2.39135077,1.73046244,1.81783890,0.21454040,0.43520890,-1.42443856,-2.72124685,-2.51313877,-1.20243091,-1.44268002 ,-0.16777305,0.05780661,2.03533992,0.39187242,0.54987983,0.57865693,-0.96592469,-0.93278473,-0.75962671,-0.63216906,1.06776183, 0.17476059 ,0.06635860,0.94906227,2.44498583,-1.04990407,-0.88440073,-1.99838258,-1.12955558,-0.62654882,-1.36589161,-2.67456821,-0.97187696, -0.84431782 ,-0.10051809,0.54239549,1.34622861,1.25598105,0.19707759,3.29286114,3.52423499,1.69146333,-0.10150024,0.45222903,-0.01730516, -0.49828727, -1.18484684,-1.09531773,-1.17190808,0.30207662].to_ts
+# @kf=Statsample::TimeSeries::ARIMA.ks(@s,1,0,0)
+# #=> ks is implictly called in above operation
+# @kf.ar
+# #=> AR coefficients
 def ks
 initial = Array.new((@p+@q), 0.0)
 
@@ -80,16 +91,16 @@ module Statsample
 end
 
 
-#=
+#=Log Likelihood
 #Computes Log likelihood on given parameters, ARMA order and timeseries
-
-
-
-
-
-
+#==params
+#* *params*: array of floats, contains phi/theta parameters
+#* *timeseries*: timeseries object
+#* *p*: integer, AR(p) order
+#* *q*: integer, MA(q) order
+#==Returns
 #LogLikelihood object
-
+#==Usage
 # s = (1..100).map { rand }.to_ts
 # p, q = 1, 0
 # ll = KalmanFilter.log_likelihood([0.2], s, p, q)
@@ -104,12 +115,12 @@ module Statsample
 #=T
 #The coefficient matrix for the state vector in state equation
 # It's dimensions is r+k x r+k
-
-
-
-
+#==Parameters
+#* *r*: integer, r is max(p, q+1), where p and q are orders of AR and MA respectively
+#* *k*: integer, number of exogeneous variables in ARMA model
+#* *q*: integer, The AR coefficient of ARMA model
 
-
+#==References Statsmodels tsa, Durbin and Koopman Section 4.7
 #def self.T(r, k, p)
 # arr = Matrix.zero(r)
 # params_padded = Statsample::Vector.new(Array.new(r, 0), :scale)
data/lib/bio-statsample-timeseries/arima/likelihood.rb
CHANGED
@@ -4,15 +4,12 @@ module Statsample
 module KF
 class LogLikelihood
 
-#log_likelihood
 #Gives log likelihood value of an ARMA(p, q) process on given parameters
 attr_reader :log_likelihood
 
-#sigma
 #Gives sigma value of an ARMA(p,q) process on given parameters
 attr_reader :sigma
 
-#aic
 #Gives AIC(Akaike Information Criterion)
 #https://www.scss.tcd.ie/Rozenn.Dahyot/ST7005/13AICBIC.pdf
 attr_reader :aic
@@ -25,7 +22,7 @@ module Statsample
 ll
 end
 
-
+#===Log likelihood link function.
 #iteratively minimized by simplex algorithm via KalmanFilter.ks
 #Not meant to be used directly. Will make it private later.
 def ll
data/lib/bio-statsample-timeseries/timeseries.rb
CHANGED
@@ -49,20 +49,15 @@ module Statsample
 
 #=Partial Autocorrelation
 #Generates partial autocorrelation series for a timeseries
-
-
-
-# *
-# *
-# *
-
-#
+#==Parameters
+#* *max_lags*: integer, optional - provide number of lags
+#* *method*: string. Default: 'yw'.
+# * *yw*: For yule-walker algorithm unbiased approach
+# * *mle*: For Maximum likelihood algorithm approach
+# * *ld*: Forr Levinson-Durbin recursive approach
+#==Returns
+# array of pacf
 def pacf(max_lags = nil, method = :yw)
-#parameters:
-#max_lags => maximum number of lags for pcf
-#method => for autocovariance in yule_walker:
-#'yw' for 'yule-walker unbaised', 'mle' for biased maximum likelihood
-#'ld' for Levinson-Durbin recursion
 
 method = method.downcase.to_sym
 max_lags ||= (10 * Math.log10(size)).to_i
@@ -78,12 +73,11 @@ module Statsample
 
 #=Autoregressive estimation
 #Generates AR(k) series for the calling timeseries by yule walker.
-
-
-
-
+#==Parameters
+#* *n*: integer, (default = 1500) number of observations for AR.
+#* *k*: integer, (default = 1) order of AR process.
+#==Returns
 #Array constituting estimated AR series.
-#
 def ar(n = 1500, k = 1)
 series = Statsample::TimeSeries.arima
 #series = Statsample::TimeSeries::ARIMA.new
@@ -92,11 +86,11 @@ module Statsample
 
 #=AutoCovariance
 #Provides autocovariance of timeseries.
-
-
-
-
-#
+#==Parameters
+#* *demean* = true; optional. Supply false if series is not to be demeaned
+#* *unbiased* = true; optional. true/false for unbiased/biased form of autocovariance
+#==Returns
+# Autocovariance value
 def acvf(demean = true, unbiased = true)
 #TODO: change parameters list in opts.merge as suggested by John
 #functionality: computes autocovariance of timeseries data
@@ -123,7 +117,6 @@ module Statsample
 
 #=Correlation
 #Gives correlation of timeseries.
-#
 def correlate(a, v, mode = 'full')
 #peforms cross-correlation of two series
 #multiarray.correlate2(a, v, 'full')
@@ -173,15 +166,16 @@ module Statsample
 # But, second difference of series is NOT X(t) - X(t-2)
 # It is the first difference of the first difference
 # => (X(t) - X(t-1)) - (X(t-1) - X(t-2))
-
-
-
+#==Params
+#* *max_lags*: integer, (default: 1), number of differences reqd.
+#==Usage
 #
 # ts = (1..10).map { rand }.to_ts
 # # => [0.69, 0.23, 0.44, 0.71, ...]
 #
 # ts.diff # => [nil, -0.46, 0.21, 0.27, ...]
-
+#==Returns
+# Timeseries object
 def diff(max_lags = 1)
 ts = self
 difference = []
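The documented behaviour of repeated first differencing can be sketched in plain Ruby; the leading `nil` padding mirrors the `ts.diff` output shown in the doc comment (this is not the gem's code):

```ruby
# Repeated first difference: each pass maps the series to adjacent deltas;
# the result is front-padded with nil, like the documented ts.diff output.
def diff_sketch(series, max_lags = 1)
  core = series.dup
  max_lags.times { core = core.each_cons(2).map { |a, b| b - a } }
  Array.new(series.size - core.size) + core
end

diff_sketch([1, 4, 9, 16])     # => [nil, 3, 5, 7]
diff_sketch([1, 4, 9, 16], 2)  # => [nil, nil, 2, 2]
```

The second call shows the point made above: the second difference is the first difference of the first difference, not `X(t) - X(t-2)`.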
@@ -195,10 +189,10 @@ module Statsample
 #=Moving Average
 # Calculates the moving average of the series using the provided
 # lookback argument. The lookback defaults to 10 periods.
-
-
+#==Parameters
+#* *n*: integer, (default = 10) - loopback argument
 #
-
+#==Usage
 #
 # ts = (1..100).map { rand }.to_ts
 # # => [0.69, 0.23, 0.44, 0.71, ...]
@@ -206,7 +200,7 @@ module Statsample
 # # first 9 observations are nil
 # ts.ma # => [ ... nil, 0.484... , 0.445... , 0.513 ... , ... ]
 #
-
+#==Returns
 #Resulting moving average timeseries object
 def ma(n = 10)
 return mean if n >= size
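The moving-average shape described above (first `n - 1` observations `nil`, then each value the mean of the trailing `n`-period window) can be sketched over a plain array (illustrative only, not the gem's code):

```ruby
# n-period simple moving average; the first n - 1 slots are nil, matching
# the documented ts.ma output shape.
def ma_sketch(series, n = 10)
  series.each_index.map do |i|
    i + 1 < n ? nil : series[(i - n + 1)..i].sum / n.to_f
  end
end

ma_sketch([1, 2, 3, 4, 5], 3)  # => [nil, nil, 2.0, 3.0, 4.0]
```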
@@ -226,20 +220,18 @@ module Statsample
 # use a lot more than n observations to calculate. The series is stable
 # if the size of the series is >= 3.45 * (n + 1)
 #
-
-
-
-#if false, uses 2/(n+1) value
-#
-#*Usage*:
+#==Parameters
+#* *n*: integer, (default = 10)
+#* *wilder*: boolean, (default = false), if true, 1/n value is used for smoothing; if false, uses 2/(n+1) value
 #
+#==Usage
 # ts = (1..100).map { rand }.to_ts
 # # => [0.69, 0.23, 0.44, 0.71, ...]
 #
 # # first 9 observations are nil
 # ts.ema # => [ ... nil, 0.509... , 0.433..., ... ]
 #
-
+#==Returns
 #EMA timeseries
 def ema(n = 10, wilder = false)
 smoother = wilder ? 1.0 / n : 2.0 / (n + 1)
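The smoothing constant is the one visible in the code: `1/n` when `wilder`, else `2/(n + 1)`. A plain-Ruby EMA sketch built on that constant follows; seeding the recursion with the simple average of the first `n` values is an assumption of this sketch, not necessarily the gem's choice:

```ruby
# EMA sketch: out[t] = smoother * x[t] + (1 - smoother) * out[t-1],
# seeded (by assumption) with the simple mean of the first n values.
def ema_sketch(series, n = 10, wilder = false)
  smoother = wilder ? 1.0 / n : 2.0 / (n + 1)
  out = Array.new(n - 1)                       # nil padding, as documented
  out << series.first(n).sum / n.to_f          # seed value
  series.drop(n).each { |v| out << smoother * v + (1 - smoother) * out.last }
  out
end

ema_sketch([1, 2, 3, 4], 2)                    # => [nil, 1.5, 2.5, 3.5]
```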
@@ -264,19 +256,18 @@ module Statsample
 # Calculates the MACD (moving average convergence-divergence) of the time
 # series - this is a comparison of a fast EMA with a slow EMA.
 #
-
-
-
-
+#==Parameters*:
+#* *fast*: integer, (default = 12) - fast component of MACD
+#* *slow*: integer, (default = 26) - slow component of MACD
+#* *signal*: integer, (default = 9) - signal component of MACD
 #
-
+#==Usage
 # ts = (1..100).map { rand }.to_ts
 # # => [0.69, 0.23, 0.44, 0.71, ...]
 # ts.macd(13)
 #
-
-# Array of two timeseries - comparison of fast EMA with slow
-# and EMA with signal value
+#==Returns
+# Array of two timeseries - comparison of fast EMA with slow and EMA with signal value
 def macd(fast = 12, slow = 26, signal = 9)
 series = ema(fast) - ema(slow)
 [series, series.ema(signal)]
data/lib/bio-statsample-timeseries/timeseries/pacf.rb
CHANGED
@@ -15,16 +15,16 @@ module Statsample
 
 
 #=Levinson-Durbin Algorithm
-
-
-
-
-
-
-
-
-
-
+#==Parameters
+#* *series*: timeseries, or a series of autocovariances
+#* *nlags*: integer(default: 10): largest lag to include in recursion or order of the AR process
+#* *is_acovf*: boolean(default: false): series is timeseries if it is false, else contains autocavariances
+#
+#==Returns:
+#* *sigma_v*: estimate of the error variance
+#* *arcoefs*: AR coefficients
+#* *pacf*: pacf function
+#* *sigma*: some function
 def self.levinson_durbin(series, nlags = 10, is_acovf = false)
 
 if is_acovf
|
|
60
60
|
return [sigma_v, arcoefs, pacf, sig, phi]
|
61
61
|
end
|
62
62
|
|
63
|
+
#Returns diagonal elements of matrices
|
64
|
+
# Will later abstract it to utilities
|
63
65
|
def self.diag(mat)
|
64
|
-
#returns array of diagonal elements of a matrix.
|
65
|
-
#will later abstract it to matrix.rb in Statsample
|
66
66
|
return mat.each_with_index(:diagonal).map { |x, r, c| x }
|
67
67
|
end
|
68
68
|
|
69
69
|
|
70
70
|
#=Yule Walker Algorithm
|
71
|
-
#From the series, estimates AR(p)(autoregressive) parameter
|
72
|
-
#using Yule-Waler equation. See -
|
71
|
+
#From the series, estimates AR(p)(autoregressive) parameter using Yule-Waler equation. See -
|
73
72
|
#http://en.wikipedia.org/wiki/Autoregressive_moving_average_model
|
74
|
-
|
75
|
-
|
76
|
-
|
77
|
-
|
78
|
-
|
79
|
-
|
80
|
-
|
81
|
-
|
82
|
-
|
73
|
+
#
|
74
|
+
#==Parameters
|
75
|
+
#* *ts*: timeseries
|
76
|
+
#* *k*: order, default = 1
|
77
|
+
#* *method*: can be 'yw' or 'mle'. If 'yw' then it is unbiased, denominator is (n - k)
|
78
|
+
#
|
79
|
+
#==Returns
|
80
|
+
#* *rho*: autoregressive coefficients
|
81
|
+
#* *sigma*: sigma parameter
|
83
82
|
def self.yule_walker(ts, k = 1, method='yw')
|
84
83
|
ts = ts - ts.mean
|
85
84
|
n = ts.size
|
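The estimation the doc describes amounts to solving a Toeplitz system of sample autocovariances. A sketch with Ruby's stdlib `Matrix`, using the unbiased `'yw'` denominator `n - lag` mentioned above (illustrative only; the gem's method also returns sigma and supports `'mle'`):

```ruby
require 'matrix'

# Yule-Walker: demean, compute autocovariances acov[0..k] with unbiased
# denominators, then solve R * phi = r where R is Toeplitz in acov.
def yule_walker_sketch(series, k = 1)
  x = series.map(&:to_f)
  mean = x.sum / x.size
  x = x.map { |v| v - mean }
  n = x.size
  acov = (0..k).map do |lag|
    (0...(n - lag)).sum { |t| x[t] * x[t + lag] } / (n - lag)
  end
  r_matrix = Matrix.build(k, k) { |i, j| acov[(i - j).abs] }
  (r_matrix.inverse * Vector[*acov[1..k]]).to_a
end

yule_walker_sketch([1, 2, 3, 4, 5], 1)  # => [0.5]
```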
@@ -110,17 +109,21 @@ module Statsample
 return [phi, sigma]
 end
 
+#=ToEplitz
+# Generates teoeplitz matrix from an array
+#http://en.wikipedia.org/wiki/Toeplitz_matrix
+#Toeplitz matrix are equal when they are stored in row & column major
+#==Parameters
+#* *arr*: array of integers;
+#==Usage
+# arr = [0,1,2,3]
+# Pacf.toeplitz(arr)
+#==Returns
+# [[0, 1, 2, 3],
+# [1, 0, 1, 2],
+# [2, 1, 0, 1],
+# [3, 2, 1, 0]]
 def self.toeplitz(arr)
-#Generates Toeplitz matrix -
-#http://en.wikipedia.org/wiki/Toeplitz_matrix
-#Toeplitz matrix are equal when they are stored in row &
-#column major
-#=> arr = [0, 1, 2, 3]
-#=> result:
-#[[0, 1, 2, 3],
-# [1, 0, 1, 2],
-# [2, 1, 0, 1],
-# [3, 2, 1, 0]]
 eplitz_matrix = Array.new(arr.size) { Array.new(arr.size) }
 
 0.upto(arr.size - 1) do |i|
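The documented output makes the structure clear: entry `(i, j)` of a Toeplitz matrix built from an array is `arr[(i - j).abs]`, so every diagonal is constant. A one-liner sketch reproducing the example above:

```ruby
# Toeplitz matrix from an array: entry (i, j) is arr[(i - j).abs].
def toeplitz_sketch(arr)
  Array.new(arr.size) { |i| Array.new(arr.size) { |j| arr[(i - j).abs] } }
end

toeplitz_sketch([0, 1, 2, 3])
# => [[0, 1, 2, 3],
#     [1, 0, 1, 2],
#     [2, 1, 0, 1],
#     [3, 2, 1, 0]]
```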
@@ -140,6 +143,8 @@ module Statsample
 eplitz_matrix
 end
 
+#===Solves matrix equations
+#Solves for X in AX = B
 def self.solve_matrix(matrix, out_vector)
 solution_vector = Array.new(out_vector.size, 0)
 matrix = matrix.to_a
data/lib/bio-statsample-timeseries/utility.rb
CHANGED
@@ -5,36 +5,35 @@ module Statsample
 include Summarizable
 
 #=Squares of sum
-
-
-
+#==Parameter
+#* *demean*: boolean - optional. __default__: false
+#==Returns
 #Sums the timeseries and then returns the square
 def squares_of_sum(demean = false)
 if demean
 m = self.mean
-self.map { |x| (x-m) }.sum
+self.map { |x| (x-m) }.sum**2
 else
-return self.sum.to_f
+return self.sum.to_f**2
 end
 end
 end
 
 
 class ::Matrix
-
-#---
+#==Squares of sum
 #Does squares of sum in column order.
 #Necessary for computations in various processes
 def squares_of_sum
 (0...column_size).map do |j|
-self.column(j).sum
+self.column(j).sum**2
 end
 end
 
-
-#---
-#returns bool
+#==Symmetric?
 #`symmetric?` is present in Ruby Matrix 1.9.3+, but not in 1.8.*
+#===Returns
+# bool
 def symmetric?
 return false unless square?
 
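Note that this hunk changes behaviour, not just documentation: `squares_of_sum` previously returned the plain (possibly demeaned) sum, and the added `**2` makes it return the square of the sum, as the name and doc comment promise. A standalone Ruby version of the corrected behaviour over a plain array:

```ruby
# Corrected squares_of_sum: sum the values (optionally demeaned first),
# then square the result.
def squares_of_sum(values, demean = false)
  if demean
    m = values.sum.to_f / values.size
    values.map { |x| x - m }.sum**2
  else
    values.sum.to_f**2
  end
end

squares_of_sum([1, 2, 3])        # => 36.0 (the old code returned 6.0)
squares_of_sum([1, 2, 3], true)  # ~ 0.0 (demeaned values sum to zero)
```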
@@ -46,15 +45,16 @@ module Statsample
 true
 end
 
-
+#==Cholesky decomposition
 #Reference: http://en.wikipedia.org/wiki/Cholesky_decomposition
-
-#==Description
+#===Description
 #Cholesky decomposition is reprsented by `M = L X L*`, where
 #M is the symmetric matrix and `L` is the lower half of cholesky matrix,
 #and `L*` is the conjugate form of `L`.
-
-
+#===Returns
+# Cholesky decomposition for a given matrix(if symmetric)
+#===Utility
+# Essential matrix function, requisite in kalman filter, least squares
 def cholesky
 raise ArgumentError, "Given matrix should be symmetric" unless symmetric?
 c = Matrix.zero(row_size)
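The decomposition `M = L X L*` described above can be sketched with stdlib `Matrix` for a real symmetric positive-definite input, where `L*` is simply the transpose (illustrative sketch, not the gem's implementation):

```ruby
require 'matrix'

# Lower-triangular L with m == l * l.transpose, for symmetric
# positive-definite m (the gem raises unless the input is symmetric).
def cholesky_sketch(m)
  n = m.row_size
  l = Array.new(n) { Array.new(n, 0.0) }
  n.times do |i|
    (0..i).each do |j|
      s = (0...j).sum { |k| l[i][k] * l[j][k] }
      l[i][j] = i == j ? Math.sqrt(m[i, i] - s) : (m[i, j] - s) / l[j][j]
    end
  end
  Matrix.rows(l)
end

l = cholesky_sketch(Matrix[[4.0, 2.0], [2.0, 3.0]])
# l * l.transpose reproduces the input matrix
```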
@@ -74,15 +74,16 @@ module Statsample
 c
 end
 
-
+#==Chain Product
 #Class method
 #Returns the chain product of two matrices
-
+#===Usage:
 #Let `a` be 4 * 3 matrix,
 #Let `b` be 3 * 3 matrix,
 #Let `c` be 3 * 1 matrix,
 #then `Matrix.chain_dot(a, b, c)`
-
+#===NOTE:
+# Send the matrices in multiplicative order with proper dimensions
 def self.chain_dot(*args)
 #inspired by Statsmodels
 begin
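With stdlib `Matrix` the chain product reduces to a left-to-right fold over `*`, assuming the dimensions conform in multiplicative order as the NOTE requires (a sketch of the idea, minus the gem's error handling):

```ruby
require 'matrix'

# Chain product: multiply all matrices left to right.
def chain_dot_sketch(*matrices)
  matrices.reduce(:*)
end

a = Matrix[[1, 2], [3, 4]]
b = Matrix[[1, 0], [0, 1]]
c = Matrix[[2], [1]]
chain_dot_sketch(a, b, c)  # => Matrix[[4], [10]]
```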
@@ -93,7 +94,7 @@ module Statsample
 end
 
 
-
+#==Adds a column of constants.
 #Appends a column of ones to the matrix/array if first argument is false
 #If an n-array, first checks if one column of ones is already present
 #if present, then original(self) is returned, else, prepends with a vector of ones
@@ -115,6 +116,7 @@ module Statsample
 return Matrix.rows(vectors)
 end
 
+#populates column i of given matrix with arr
 def set_column(i, arr)
 columns = self.column_vectors
 column = columns[i].to_a
@@ -122,7 +124,8 @@ module Statsample
 columns[i] = column
 return Matrix.columns(columns)
 end
-
+
+#populates row i of given matrix with arr
 def set_row(i, arr)
 #similar implementation as set_column
 #writing and commenting metaprogrammed version
metadata
CHANGED
@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: bio-statsample-timeseries
 version: !ruby/object:Gem::Version
-version: 0.2.0
+version: 0.2.1
 prerelease:
 platform: ruby
 authors:
@@ -10,7 +10,7 @@ authors:
 autorequire:
 bindir: bin
 cert_chain: []
-date: 2013-09-
+date: 2013-09-23 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
 name: statsample
@@ -260,7 +260,7 @@ required_ruby_version: !ruby/object:Gem::Requirement
 version: '0'
 segments:
 - 0
-hash:
+hash: -258712685
 required_rubygems_version: !ruby/object:Gem::Requirement
 none: false
 requirements: