statarray 0.0.1

Files changed (5)
  1. data/AUTHORS +2 -0
  2. data/GPL +340 -0
  3. data/README +14 -0
  4. data/lib/statarray.rb +285 -0
  5. metadata +43 -0
data/AUTHORS ADDED
@@ -0,0 +1,2 @@
1
+ Daniel Cutting
2
+ David Symonds
data/GPL ADDED
@@ -0,0 +1,340 @@
1
+ GNU GENERAL PUBLIC LICENSE
2
+ Version 2, June 1991
3
+
4
+ Copyright (C) 1989, 1991 Free Software Foundation, Inc.
5
+ 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
6
+ Everyone is permitted to copy and distribute verbatim copies
7
+ of this license document, but changing it is not allowed.
8
+
9
+ Preamble
10
+
11
+ The licenses for most software are designed to take away your
12
+ freedom to share and change it. By contrast, the GNU General Public
13
+ License is intended to guarantee your freedom to share and change free
14
+ software--to make sure the software is free for all its users. This
15
+ General Public License applies to most of the Free Software
16
+ Foundation's software and to any other program whose authors commit to
17
+ using it. (Some other Free Software Foundation software is covered by
18
+ the GNU Library General Public License instead.) You can apply it to
19
+ your programs, too.
20
+
21
+ When we speak of free software, we are referring to freedom, not
22
+ price. Our General Public Licenses are designed to make sure that you
23
+ have the freedom to distribute copies of free software (and charge for
24
+ this service if you wish), that you receive source code or can get it
25
+ if you want it, that you can change the software or use pieces of it
26
+ in new free programs; and that you know you can do these things.
27
+
28
+ To protect your rights, we need to make restrictions that forbid
29
+ anyone to deny you these rights or to ask you to surrender the rights.
30
+ These restrictions translate to certain responsibilities for you if you
31
+ distribute copies of the software, or if you modify it.
32
+
33
+ For example, if you distribute copies of such a program, whether
34
+ gratis or for a fee, you must give the recipients all the rights that
35
+ you have. You must make sure that they, too, receive or can get the
36
+ source code. And you must show them these terms so they know their
37
+ rights.
38
+
39
+ We protect your rights with two steps: (1) copyright the software, and
40
+ (2) offer you this license which gives you legal permission to copy,
41
+ distribute and/or modify the software.
42
+
43
+ Also, for each author's protection and ours, we want to make certain
44
+ that everyone understands that there is no warranty for this free
45
+ software. If the software is modified by someone else and passed on, we
46
+ want its recipients to know that what they have is not the original, so
47
+ that any problems introduced by others will not reflect on the original
48
+ authors' reputations.
49
+
50
+ Finally, any free program is threatened constantly by software
51
+ patents. We wish to avoid the danger that redistributors of a free
52
+ program will individually obtain patent licenses, in effect making the
53
+ program proprietary. To prevent this, we have made it clear that any
54
+ patent must be licensed for everyone's free use or not licensed at all.
55
+
56
+ The precise terms and conditions for copying, distribution and
57
+ modification follow.
58
+
59
+ GNU GENERAL PUBLIC LICENSE
60
+ TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
61
+
62
+ 0. This License applies to any program or other work which contains
63
+ a notice placed by the copyright holder saying it may be distributed
64
+ under the terms of this General Public License. The "Program", below,
65
+ refers to any such program or work, and a "work based on the Program"
66
+ means either the Program or any derivative work under copyright law:
67
+ that is to say, a work containing the Program or a portion of it,
68
+ either verbatim or with modifications and/or translated into another
69
+ language. (Hereinafter, translation is included without limitation in
70
+ the term "modification".) Each licensee is addressed as "you".
71
+
72
+ Activities other than copying, distribution and modification are not
73
+ covered by this License; they are outside its scope. The act of
74
+ running the Program is not restricted, and the output from the Program
75
+ is covered only if its contents constitute a work based on the
76
+ Program (independent of having been made by running the Program).
77
+ Whether that is true depends on what the Program does.
78
+
79
+ 1. You may copy and distribute verbatim copies of the Program's
80
+ source code as you receive it, in any medium, provided that you
81
+ conspicuously and appropriately publish on each copy an appropriate
82
+ copyright notice and disclaimer of warranty; keep intact all the
83
+ notices that refer to this License and to the absence of any warranty;
84
+ and give any other recipients of the Program a copy of this License
85
+ along with the Program.
86
+
87
+ You may charge a fee for the physical act of transferring a copy, and
88
+ you may at your option offer warranty protection in exchange for a fee.
89
+
90
+ 2. You may modify your copy or copies of the Program or any portion
91
+ of it, thus forming a work based on the Program, and copy and
92
+ distribute such modifications or work under the terms of Section 1
93
+ above, provided that you also meet all of these conditions:
94
+
95
+ a) You must cause the modified files to carry prominent notices
96
+ stating that you changed the files and the date of any change.
97
+
98
+ b) You must cause any work that you distribute or publish, that in
99
+ whole or in part contains or is derived from the Program or any
100
+ part thereof, to be licensed as a whole at no charge to all third
101
+ parties under the terms of this License.
102
+
103
+ c) If the modified program normally reads commands interactively
104
+ when run, you must cause it, when started running for such
105
+ interactive use in the most ordinary way, to print or display an
106
+ announcement including an appropriate copyright notice and a
107
+ notice that there is no warranty (or else, saying that you provide
108
+ a warranty) and that users may redistribute the program under
109
+ these conditions, and telling the user how to view a copy of this
110
+ License. (Exception: if the Program itself is interactive but
111
+ does not normally print such an announcement, your work based on
112
+ the Program is not required to print an announcement.)
113
+
114
+ These requirements apply to the modified work as a whole. If
115
+ identifiable sections of that work are not derived from the Program,
116
+ and can be reasonably considered independent and separate works in
117
+ themselves, then this License, and its terms, do not apply to those
118
+ sections when you distribute them as separate works. But when you
119
+ distribute the same sections as part of a whole which is a work based
120
+ on the Program, the distribution of the whole must be on the terms of
121
+ this License, whose permissions for other licensees extend to the
122
+ entire whole, and thus to each and every part regardless of who wrote it.
123
+
124
+ Thus, it is not the intent of this section to claim rights or contest
125
+ your rights to work written entirely by you; rather, the intent is to
126
+ exercise the right to control the distribution of derivative or
127
+ collective works based on the Program.
128
+
129
+ In addition, mere aggregation of another work not based on the Program
130
+ with the Program (or with a work based on the Program) on a volume of
131
+ a storage or distribution medium does not bring the other work under
132
+ the scope of this License.
133
+
134
+ 3. You may copy and distribute the Program (or a work based on it,
135
+ under Section 2) in object code or executable form under the terms of
136
+ Sections 1 and 2 above provided that you also do one of the following:
137
+
138
+ a) Accompany it with the complete corresponding machine-readable
139
+ source code, which must be distributed under the terms of Sections
140
+ 1 and 2 above on a medium customarily used for software interchange; or,
141
+
142
+ b) Accompany it with a written offer, valid for at least three
143
+ years, to give any third party, for a charge no more than your
144
+ cost of physically performing source distribution, a complete
145
+ machine-readable copy of the corresponding source code, to be
146
+ distributed under the terms of Sections 1 and 2 above on a medium
147
+ customarily used for software interchange; or,
148
+
149
+ c) Accompany it with the information you received as to the offer
150
+ to distribute corresponding source code. (This alternative is
151
+ allowed only for noncommercial distribution and only if you
152
+ received the program in object code or executable form with such
153
+ an offer, in accord with Subsection b above.)
154
+
155
+ The source code for a work means the preferred form of the work for
156
+ making modifications to it. For an executable work, complete source
157
+ code means all the source code for all modules it contains, plus any
158
+ associated interface definition files, plus the scripts used to
159
+ control compilation and installation of the executable. However, as a
160
+ special exception, the source code distributed need not include
161
+ anything that is normally distributed (in either source or binary
162
+ form) with the major components (compiler, kernel, and so on) of the
163
+ operating system on which the executable runs, unless that component
164
+ itself accompanies the executable.
165
+
166
+ If distribution of executable or object code is made by offering
167
+ access to copy from a designated place, then offering equivalent
168
+ access to copy the source code from the same place counts as
169
+ distribution of the source code, even though third parties are not
170
+ compelled to copy the source along with the object code.
171
+
172
+ 4. You may not copy, modify, sublicense, or distribute the Program
173
+ except as expressly provided under this License. Any attempt
174
+ otherwise to copy, modify, sublicense or distribute the Program is
175
+ void, and will automatically terminate your rights under this License.
176
+ However, parties who have received copies, or rights, from you under
177
+ this License will not have their licenses terminated so long as such
178
+ parties remain in full compliance.
179
+
180
+ 5. You are not required to accept this License, since you have not
181
+ signed it. However, nothing else grants you permission to modify or
182
+ distribute the Program or its derivative works. These actions are
183
+ prohibited by law if you do not accept this License. Therefore, by
184
+ modifying or distributing the Program (or any work based on the
185
+ Program), you indicate your acceptance of this License to do so, and
186
+ all its terms and conditions for copying, distributing or modifying
187
+ the Program or works based on it.
188
+
189
+ 6. Each time you redistribute the Program (or any work based on the
190
+ Program), the recipient automatically receives a license from the
191
+ original licensor to copy, distribute or modify the Program subject to
192
+ these terms and conditions. You may not impose any further
193
+ restrictions on the recipients' exercise of the rights granted herein.
194
+ You are not responsible for enforcing compliance by third parties to
195
+ this License.
196
+
197
+ 7. If, as a consequence of a court judgment or allegation of patent
198
+ infringement or for any other reason (not limited to patent issues),
199
+ conditions are imposed on you (whether by court order, agreement or
200
+ otherwise) that contradict the conditions of this License, they do not
201
+ excuse you from the conditions of this License. If you cannot
202
+ distribute so as to satisfy simultaneously your obligations under this
203
+ License and any other pertinent obligations, then as a consequence you
204
+ may not distribute the Program at all. For example, if a patent
205
+ license would not permit royalty-free redistribution of the Program by
206
+ all those who receive copies directly or indirectly through you, then
207
+ the only way you could satisfy both it and this License would be to
208
+ refrain entirely from distribution of the Program.
209
+
210
+ If any portion of this section is held invalid or unenforceable under
211
+ any particular circumstance, the balance of the section is intended to
212
+ apply and the section as a whole is intended to apply in other
213
+ circumstances.
214
+
215
+ It is not the purpose of this section to induce you to infringe any
216
+ patents or other property right claims or to contest validity of any
217
+ such claims; this section has the sole purpose of protecting the
218
+ integrity of the free software distribution system, which is
219
+ implemented by public license practices. Many people have made
220
+ generous contributions to the wide range of software distributed
221
+ through that system in reliance on consistent application of that
222
+ system; it is up to the author/donor to decide if he or she is willing
223
+ to distribute software through any other system and a licensee cannot
224
+ impose that choice.
225
+
226
+ This section is intended to make thoroughly clear what is believed to
227
+ be a consequence of the rest of this License.
228
+
229
+ 8. If the distribution and/or use of the Program is restricted in
230
+ certain countries either by patents or by copyrighted interfaces, the
231
+ original copyright holder who places the Program under this License
232
+ may add an explicit geographical distribution limitation excluding
233
+ those countries, so that distribution is permitted only in or among
234
+ countries not thus excluded. In such case, this License incorporates
235
+ the limitation as if written in the body of this License.
236
+
237
+ 9. The Free Software Foundation may publish revised and/or new versions
238
+ of the General Public License from time to time. Such new versions will
239
+ be similar in spirit to the present version, but may differ in detail to
240
+ address new problems or concerns.
241
+
242
+ Each version is given a distinguishing version number. If the Program
243
+ specifies a version number of this License which applies to it and "any
244
+ later version", you have the option of following the terms and conditions
245
+ either of that version or of any later version published by the Free
246
+ Software Foundation. If the Program does not specify a version number of
247
+ this License, you may choose any version ever published by the Free Software
248
+ Foundation.
249
+
250
+ 10. If you wish to incorporate parts of the Program into other free
251
+ programs whose distribution conditions are different, write to the author
252
+ to ask for permission. For software which is copyrighted by the Free
253
+ Software Foundation, write to the Free Software Foundation; we sometimes
254
+ make exceptions for this. Our decision will be guided by the two goals
255
+ of preserving the free status of all derivatives of our free software and
256
+ of promoting the sharing and reuse of software generally.
257
+
258
+ NO WARRANTY
259
+
260
+ 11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY
261
+ FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN
262
+ OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES
263
+ PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED
264
+ OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
265
+ MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS
266
+ TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE
267
+ PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING,
268
+ REPAIR OR CORRECTION.
269
+
270
+ 12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
271
+ WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR
272
+ REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES,
273
+ INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING
274
+ OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED
275
+ TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY
276
+ YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER
277
+ PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE
278
+ POSSIBILITY OF SUCH DAMAGES.
279
+
280
+ END OF TERMS AND CONDITIONS
281
+
282
+ How to Apply These Terms to Your New Programs
283
+
284
+ If you develop a new program, and you want it to be of the greatest
285
+ possible use to the public, the best way to achieve this is to make it
286
+ free software which everyone can redistribute and change under these terms.
287
+
288
+ To do so, attach the following notices to the program. It is safest
289
+ to attach them to the start of each source file to most effectively
290
+ convey the exclusion of warranty; and each file should have at least
291
+ the "copyright" line and a pointer to where the full notice is found.
292
+
293
+ <one line to give the program's name and a brief idea of what it does.>
294
+ Copyright (C) <year> <name of author>
295
+
296
+ This program is free software; you can redistribute it and/or modify
297
+ it under the terms of the GNU General Public License as published by
298
+ the Free Software Foundation; either version 2 of the License, or
299
+ (at your option) any later version.
300
+
301
+ This program is distributed in the hope that it will be useful,
302
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
303
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
304
+ GNU General Public License for more details.
305
+
306
+ You should have received a copy of the GNU General Public License
307
+ along with this program; if not, write to the Free Software
308
+ Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
309
+
310
+
311
+ Also add information on how to contact you by electronic and paper mail.
312
+
313
+ If the program is interactive, make it output a short notice like this
314
+ when it starts in an interactive mode:
315
+
316
+ Gnomovision version 69, Copyright (C) year name of author
317
+ Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
318
+ This is free software, and you are welcome to redistribute it
319
+ under certain conditions; type `show c' for details.
320
+
321
+ The hypothetical commands `show w' and `show c' should show the appropriate
322
+ parts of the General Public License. Of course, the commands you use may
323
+ be called something other than `show w' and `show c'; they could even be
324
+ mouse-clicks or menu items--whatever suits your program.
325
+
326
+ You should also get your employer (if you work as a programmer) or your
327
+ school, if any, to sign a "copyright disclaimer" for the program, if
328
+ necessary. Here is a sample; alter the names:
329
+
330
+ Yoyodyne, Inc., hereby disclaims all copyright interest in the program
331
+ `Gnomovision' (which makes passes at compilers) written by James Hacker.
332
+
333
+ <signature of Ty Coon>, 1 April 1989
334
+ Ty Coon, President of Vice
335
+
336
+ This General Public License does not permit incorporating your program into
337
+ proprietary programs. If your program is a subroutine library, you may
338
+ consider it more useful to permit linking proprietary applications with the
339
+ library. If this is what you want to do, use the GNU Library General
340
+ Public License instead of this License.
data/README ADDED
@@ -0,0 +1,14 @@
1
+ StatArray 0.0.1
2
+ Soyabean Software Pty Ltd
3
+
4
+ StatArray is a simple way of calculating statistics about an array of data.
5
+ Originally part of the ROMNeT suite.
6
+
7
+ Authors: Dan Cutting <dcutting@soyabean.com.au>
8
+ David Symonds <dsymonds@gmail.com>
9
+
10
+ Copyright (C) 2005-2006 Soyabean Software Pty Ltd <http://www.soyabean.com.au>
11
+
12
+ You should have received a copy of the GNU General Public License along with
13
+ StatArray; if not, write to the Free Software Foundation, Inc., 51 Franklin St,
14
+ Fifth Floor, Boston, MA 02110-1301 USA
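The README gives no usage example. As a minimal sketch of the kind of statistics it describes, the block below computes a mean, a median, and sample and population variances for a plain array. `MiniStats` is a hypothetical stand-in written for this illustration, not the gem's API; its conventions (0.0 for an empty mean, variance of fewer than two elements treated as 0.0) are assumed to follow those in `lib/statarray.rb`.

```ruby
# Hypothetical stand-in for illustration; not part of the statarray gem.
module MiniStats
  module_function

  # Arithmetic mean; 0.0 for an empty array.
  def mean(xs)
    return 0.0 if xs.empty?
    xs.inject(0) { |acc, x| acc + x }.to_f / xs.size
  end

  # Median: the middle value, or the average of the two middle values.
  def median(xs)
    return 0 if xs.empty?
    sorted = xs.sort
    mid = sorted.size / 2
    sorted.size.even? ? (sorted[mid - 1] + sorted[mid]).to_f / 2 : sorted[mid]
  end

  # Sum of squared deviations from the mean.
  def summed_sqdevs(xs)
    m = mean(xs)
    xs.inject(0.0) { |acc, x| acc + (x - m)**2 }
  end

  # Sample variance divides by n - 1.
  def variance(xs)
    xs.size < 2 ? 0.0 : summed_sqdevs(xs) / (xs.size - 1)
  end

  # Population variance divides by n.
  def pvariance(xs)
    xs.size < 2 ? 0.0 : summed_sqdevs(xs) / xs.size
  end
end

data = [2, 4, 4, 4, 5, 5, 7, 9]
puts MiniStats.mean(data)      # 5.0
puts MiniStats.median(data)    # 4.5
puts MiniStats.pvariance(data) # 4.0
```

The only difference between the two variances is the divisor: the squared deviations here sum to 32, so the population variance is 32/8 and the sample variance is 32/7.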
data/lib/statarray.rb ADDED
@@ -0,0 +1,285 @@
1
+ module Math
2
+ def Math::float_equal(a,b)
3
+ c = a-b
4
+ c *= -1.0 if c < 0
5
+ c < 0.000000001 # TODO: how should we pick epsilon?
6
+ end
7
+ end
8
+
9
+ class StatArray < Array
10
+ alias :count size
11
+
12
+ def sum
13
+ inject(0) { |sum, x| sum + x }
14
+ end
15
+
16
+ def mean
17
+ return 0.0 if self.size == 0
18
+ sum.to_f / self.size
19
+ end
20
+ alias :arithmetic_mean :mean
21
+
22
+ def median
23
+ return 0 if self.size == 0
24
+ tmp = sort
25
+ mid = tmp.size / 2
26
+ if (tmp.size % 2) == 0
27
+ (tmp[mid-1] + tmp[mid]).to_f / 2
28
+ else
29
+ tmp[mid]
30
+ end
31
+ end
32
+
33
+ # The sum of the squared deviations from the mean.
34
+ def summed_sqdevs
35
+ return 0 if count < 2
36
+ m = mean
37
+ StatArray.new(map { |x| (x - m) ** 2 }).sum
38
+ end
39
+
40
+ # Variance of the sample.
41
+ def variance
42
+ # Variance of 0 or 1 elements is 0.0
43
+ return 0.0 if count < 2
44
+ summed_sqdevs / (count - 1)
45
+ end
46
+
47
+ # Variance of a population.
48
+ def pvariance
49
+ # Variance of 0 or 1 elements is 0.0
50
+ return 0.0 if count < 2
51
+ summed_sqdevs / count
52
+ end
53
+
54
+ # Standard deviation of a sample.
55
+ def stddev
56
+ Math::sqrt(variance)
57
+ end
58
+
59
+ # Standard deviation of a population.
60
+ def pstddev
61
+ Math::sqrt(pvariance)
62
+ end
63
+
64
+ # Calculates the standard error of this sample.
65
+ def stderr
66
+ return 0.0 if count < 2
67
+ stddev/Math::sqrt(size)
68
+ end
69
+
70
+ # Returns the confidence interval for this sample as [lower,upper].
71
+ # doc (the degree of confidence, in percent) can be 90, 95, 99 or 999 (meaning 99.9%), defaulting to 95.
72
+ def ci(doc = 95)
73
+ limit = climit(doc)
74
+ [mean-limit,mean+limit]
75
+ end
76
+
77
+ # Returns E, the error associated with this sample for the given degree of
78
+ # confidence.
79
+ def climit(doc = 95)
80
+ TTable::t(doc,count)*stderr
81
+ end
82
+
83
+ # Calculates the relative mean difference of this sample.
84
+ # Makes use of the fact that the Gini Coefficient is half the RMD.
85
+ def relative_mean_difference
86
+ return 0.0 if Math::float_equal(mean,0.0)
87
+ gini_coefficient * 2
88
+ end
89
+ alias :rmd :relative_mean_difference
90
+
91
+ # The average absolute difference of two independent values drawn from
92
+ # the sample. Equal to the RMD * the mean.
93
+ def mean_difference
94
+ relative_mean_difference * mean
95
+ end
96
+ alias :absolute_mean_difference :mean_difference
97
+ alias :md :mean_difference
98
+
99
+ # Pearson's second skewness coefficient of this sample: 3 * (mean - median) / stddev.
100
+ def pearson_skewness2
101
+ 3*(mean-median)/stddev
102
+ end
103
+
104
+ # The skewness of this sample.
105
+ def skewness
106
+ fail "Buggy"
107
+ return 0.0 if count < 2
108
+ m = mean
109
+ s = inject(0) { |sum,xi| sum+(xi-m)**3 }
110
+ s.to_f/(count*variance**(3.0/2)) # 3/2 would be integer division (== 1) in Ruby
111
+ end
112
+
113
+ # The kurtosis of this sample.
114
+ def kurtosis
115
+ fail "Buggy"
116
+ return 0.0 if count < 2
117
+ m = mean
118
+ s = 0
119
+ each { |xi| s += (xi-m)**4 }
120
+ (s.to_f/((count-1)*variance**2))-3
121
+ end
122
+
123
+ # Calculates the Theil index (a statistic used to measure economic
124
+ # inequality). http://en.wikipedia.org/wiki/Theil_index
125
+ # TI = \sum_{i=1}^N \frac{x_i}{\sum_{j=1}^N x_j} ln \frac{x_i}{\bar{x}}
126
+ def theil_index
127
+ return -1 if count <= 0 or any? { |x| x < 0 }
128
+ return 0 if count < 2 or all? { |x| Math::float_equal(x,0) }
129
+ m = mean
130
+ s = sum.to_f
131
+ inject(0) do |theil,xi|
132
+ theil + ((xi > 0) ? (Math::log(xi.to_f/m) * xi.to_f/s) : 0.0)
133
+ end
134
+ end
135
+
136
+ # Closely related to the Theil index and easily expressible in terms of it.
137
+ # http://en.wikipedia.org/wiki/Atkinson_index
138
+ # AI = 1-e^{-theil_index}
139
+ def atkinson_index
140
+ t = theil_index
141
+ (t < 0) ? -1 : 1-Math::E**(-t)
142
+ end
143
+
144
+ # Calculates the Gini Coefficient (a measure of inequality of a distribution
145
+ # based on the area between the Lorenz curve and the uniform curve).
146
+ # http://en.wikipedia.org/wiki/Gini_coefficient
147
+ # GC = \frac{1}{N} \left ( N+1-2\frac{\sum_{i=1}^N (N+1-i)y_i}{\sum_{i=1}^N y_i} \right )
148
+ def gini_coefficient2
149
+ return -1 if count <= 0 or any? { |x| x < 0 }
150
+ return 0 if count < 2 or all? { |x| Math::float_equal(x,0) }
151
+ s = 0
152
+ sort.each_with_index { |yi,i| s += (size - i)*yi }
153
+ (size+1-2*(s.to_f/sum.to_f)).to_f/size.to_f
154
+ end
155
+
156
+ # Slightly cleaner way of calculating the Gini Coefficient. Any quicker?
157
+ # GC = \frac{\sum_{i=1}^N (2i-N-1)x_i}{N^2 \bar{x}}
158
+ def gini_coefficient
159
+ return -1 if count <= 0 or any? { |x| x < 0 }
160
+ return 0 if count < 2 or all? { |x| Math::float_equal(x,0) }
161
+ s = 0
162
+ sort.each_with_index { |li,i| s += (2*i+1-size)*li }
163
+ s.to_f/(size**2*mean).to_f
164
+ end
165
+
166
+ # The KL-divergence from this array to that of q.
167
+ # NB: You will possibly want to sort both P and Q before calling this
168
+ # depending on what you're actually trying to measure.
169
+ # http://en.wikipedia.org/wiki/Kullback-Leibler_divergence
170
+ def kullback_leibler_divergence(q)
171
+ fail "Buggy."
172
+ fail "Cannot compare differently sized arrays." unless size == q.size
173
+ kld = 0
174
+ each_with_index { |pi,i| kld += pi*Math::log(pi.to_f/q[i].to_f) }
175
+ kld
176
+ end
177
+
178
+ # Returns the cumulative distribution of this sample's sorted values, scaled so that the last entry equals normalised (1.0 by default).
179
+ def cdf(normalised = 1.0)
180
+ s = sum.to_f
181
+ sort.inject([0.0]) { |c,d| c << c[-1] + normalised*d.to_f/s }
182
+ end
183
+
184
+ def stats
185
+ if size != 0
186
+ return %Q/#{"%12.2f" % sum} #{"%12.2f" % mean} #{"%12.2f" % stddev} #{"%12.2f" % min} #{"%12.2f" % max} #{"%12.2f" % median} #{"%12.2f" % size}/
187
+ else
188
+ return %Q/<error>/
189
+ end
190
+ end
191
+
192
+ def to_stats
193
+ { :sum => sum, :mean => mean, :stddev => stddev, :min => min, :max => max, :median => median, :count => size }
194
+ end
195
+
196
+ def StatArray.stats_header
197
+ %Q/#{"%12s" % "Sum"} #{"%12s" % "Avg."} #{"%12s" % "Std.dev."} #{"%12s" % "Min."} #{"%12s" % "Max."} #{"%12s" % "Median"} #{"%12s" % "Count"}/
198
+ end
199
+ end
200
+
201
+ class TTable
202
+ # Format of rawtvalues:
203
+ # DegreesOfFreedom 90% 95% 99% 99.9%
204
+ @@rawtvalues = <<EOF
205
+ 1 6.31 12.71 63.66 636.62
206
+ 2 2.92 4.30 9.93 31.60
207
+ 3 2.35 3.18 5.84 12.92
208
+ 4 2.13 2.78 4.60 8.61
209
+ 5 2.02 2.57 4.03 6.87
210
+ 6 1.94 2.45 3.71 5.96
211
+ 7 1.89 2.37 3.50 5.41
212
+ 8 1.86 2.31 3.36 5.04
213
+ 9 1.83 2.26 3.25 4.78
214
+ 10 1.81 2.23 3.17 4.59
215
+ 11 1.80 2.20 3.11 4.44
216
+ 12 1.78 2.18 3.06 4.32
217
+ 13 1.77 2.16 3.01 4.22
218
+ 14 1.76 2.14 2.98 4.14
219
+ 15 1.75 2.13 2.95 4.07
220
+ 16 1.75 2.12 2.92 4.02
221
+ 17 1.74 2.11 2.90 3.97
222
+ 18 1.73 2.10 2.88 3.92
223
+ 19 1.73 2.09 2.86 3.88
224
+ 20 1.72 2.09 2.85 3.85
225
+ 21 1.72 2.08 2.83 3.82
226
+ 22 1.72 2.07 2.82 3.79
227
+ 23 1.71 2.07 2.82 3.77
228
+ 24 1.71 2.06 2.80 3.75
229
+ 25 1.71 2.06 2.79 3.73
230
+ 26 1.71 2.06 2.78 3.71
231
+ 27 1.70 2.05 2.77 3.69
232
+ 28 1.70 2.05 2.76 3.67
233
+ 29 1.70 2.05 2.76 3.66
234
+ 30 1.64 1.96 2.58 3.29
235
+ EOF
236
+ @@tvalues = nil
237
+
238
+ def TTable::parseTValues
239
+ @@tvalues = Array.new
240
+ @@rawtvalues.split(/\n/).each do |row|
241
+ @@tvalues << row.split(/\s+/).map { |i| i.to_f }
242
+ end
243
+ end
244
+
245
+ def TTable.t(dc,samples = 31)
246
+ fail ArgumentError.new("Need at least 2 samples to find a t-value.") if samples < 2
247
+ samples = 31 if samples > 31
248
+ case dc
249
+ when 90
250
+ dci = 1
251
+ when 95
252
+ dci = 2
253
+ when 99
254
+ dci = 3
255
+ when 999
256
+ dci = 4
257
+ else
258
+ fail ArgumentError.new("Cannot calculate t-value for #{dc}% degree of confidence.")
259
+ end
260
+ TTable::parseTValues unless @@tvalues
261
+ @@tvalues[samples-1-1][dci]
262
+ end
263
+
264
+ def TTable.t90(samples = 31)
265
+ TTable::t(90,samples)
266
+ end
267
+
268
+ def TTable.t95(samples = 31)
269
+ TTable::t(95,samples)
270
+ end
271
+
272
+ def TTable.t99(samples = 31)
273
+ TTable::t(99,samples)
274
+ end
275
+
276
+ def TTable.t999(samples = 31)
277
+ TTable::t(999,samples)
278
+ end
279
+ end
280
+
281
+ class Array
282
+ def to_statarray
283
+ StatArray.new(self)
284
+ end
285
+ end
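The confidence-interval code above combines three steps: a t-value lookup, the standard error, and the mean plus or minus their product. The following is a self-contained sketch of those steps for the 95% case only. The helper names `t95_for` and `ci95` and the constant `T95` are assumptions made for this illustration; the t-values are the 95% column for 1 to 10 degrees of freedom from the table in `lib/statarray.rb`.

```ruby
# Illustrative sketch; the names below are assumptions, not the gem's API.

# 95% t-values for 1..10 degrees of freedom, copied from the table in
# lib/statarray.rb. Index 0 holds the value for 1 degree of freedom.
T95 = [12.71, 4.30, 3.18, 2.78, 2.57, 2.45, 2.37, 2.31, 2.26, 2.23].freeze

# Looks up the t-value for a sample of the given size.
# Degrees of freedom are samples - 1 and the array is zero-based,
# hence the index samples - 2 (written samples-1-1 in TTable.t).
def t95_for(samples)
  raise ArgumentError, "Need at least 2 samples" if samples < 2
  raise ArgumentError, "This sketch only covers up to 11 samples" if samples > T95.size + 1
  T95[samples - 2]
end

# Returns [lower, upper] for the 95% confidence interval of the mean.
def ci95(xs)
  n = xs.size
  mean = xs.inject(0) { |acc, x| acc + x }.to_f / n
  variance = xs.inject(0.0) { |acc, x| acc + (x - mean)**2 } / (n - 1)
  stderr = Math.sqrt(variance) / Math.sqrt(n) # standard error of the mean
  limit = t95_for(n) * stderr
  [mean - limit, mean + limit]
end

lower, upper = ci95([2, 4, 4, 4, 5, 5, 7, 9])
puts format("%.3f .. %.3f", lower, upper) # 3.208 .. 6.792
```

With eight samples there are seven degrees of freedom, so the lookup returns 2.37. Unlike `TTable.t`, which clamps larger samples to its last row, this sketch raises for samples outside its small table.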
metadata ADDED
@@ -0,0 +1,43 @@
1
+ --- !ruby/object:Gem::Specification
2
+ rubygems_version: 0.8.10
3
+ specification_version: 1
4
+ name: statarray
5
+ version: !ruby/object:Gem::Version
6
+ version: 0.0.1
7
+ date: 2006-09-01
8
+ summary: StatArray is a simple way of calculating statistics about an array of data.
9
+ require_paths:
10
+ - lib
11
+ email: dcutting@soyabean.com.au
12
+ homepage: http://soyabean.com.au
13
+ rubyforge_project:
14
+ description:
15
+ autorequire: statarray
16
+ default_executable:
17
+ bindir: bin
18
+ has_rdoc: true
19
+ required_ruby_version: !ruby/object:Gem::Version::Requirement
20
+ requirements:
21
+ -
22
+ - ">"
23
+ - !ruby/object:Gem::Version
24
+ version: 0.0.0
25
+ version:
26
+ platform: ruby
27
+ authors:
28
+ - Dan Cutting
29
+ files:
30
+ - lib/statarray.rb
31
+ - README
32
+ - AUTHORS
33
+ - GPL
34
+ test_files: []
35
+ rdoc_options: []
36
+ extra_rdoc_files:
37
+ - README
38
+ - AUTHORS
39
+ - GPL
40
+ executables: []
41
+ extensions: []
42
+ requirements: []
43
+ dependencies: []
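The metadata above is a YAML-serialized `Gem::Specification`. As a rough sketch, the same core fields could be written as a Ruby gemspec like the one below. The wrapper method `statarray_spec` is a hypothetical name used only for this illustration, and the `autorequire` and `has_rdoc` fields are left out of the sketch because later RubyGems versions deprecate them.

```ruby
require "rubygems"

# Hypothetical helper: builds a specification mirroring the metadata above.
def statarray_spec
  Gem::Specification.new do |s|
    s.name             = "statarray"
    s.version          = "0.0.1"
    s.summary          = "StatArray is a simple way of calculating statistics about an array of data."
    s.authors          = ["Dan Cutting"]
    s.email            = "dcutting@soyabean.com.au"
    s.homepage         = "http://soyabean.com.au"
    s.require_paths    = ["lib"]
    s.files            = ["lib/statarray.rb", "README", "AUTHORS", "GPL"]
    s.extra_rdoc_files = ["README", "AUTHORS", "GPL"]
  end
end

puts statarray_spec.full_name # statarray-0.0.1
```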