experiment 0.0.1
- data/History.txt +4 -0
- data/Manifest.txt +15 -0
- data/README.rdoc +123 -0
- data/Rakefile +26 -0
- data/bin/experiment +277 -0
- data/lib/experiment.rb +6 -0
- data/lib/experiment/base.rb +162 -0
- data/lib/experiment/config.rb +40 -0
- data/lib/experiment/generator/Rakefile +17 -0
- data/lib/experiment/generator/experiment_template.rb +35 -0
- data/lib/experiment/generator/readme_template.txt +24 -0
- data/lib/experiment/notify.rb +6 -0
- data/lib/experiment/stats.rb +43 -0
- data/test/test_experiment.rb +11 -0
- data/test/test_helper.rb +3 -0
- metadata +112 -0
data/History.txt
ADDED
data/Manifest.txt
ADDED
@@ -0,0 +1,15 @@
History.txt
Manifest.txt
README.rdoc
Rakefile
lib/experiment.rb
lib/experiment/config.rb
lib/experiment/stats.rb
lib/experiment/generator/readme_template.txt
lib/experiment/generator/experiment_template.rb
lib/experiment/generator/Rakefile
lib/experiment/base.rb
lib/experiment/notify.rb
test/test_experiment.rb
test/test_helper.rb
bin/experiment
data/README.rdoc
ADDED
@@ -0,0 +1,123 @@
= Experiment
* http://github.com/gampleman/experiment

== What's it about?

Experiment is a Ruby library and environment for running scientific experiments (e.g. AI, GA...). It is especially suited to experiments that optimize results by varying an algorithm or its parameters.

== Installation

  $ sudo gem install experiment

== Getting started

Experiment is modeled after Rails, so the workflow should feel familiar.

First start by generating your project:

  $ experiment new my_project

This will create several files and directories. Here is a short introduction to them.

First off is the `app` directory. This is where the basic implementation of what you mean to do goes. You can write your code however you want; just make sure it is well structured, because you will be overriding parts of it later in your experiments.

== Setting up an experiment

Experiments are set up in the `experiments` directory. The first thing you need to do is define what constitutes an experiment in your case. To do this, open the file `experiments/experiment.rb`. You will notice that this file contains a number of comments and a stub that show you what to do.

For a typical experiment you will need to do some setup work (e.g. initialize your classes, calculate parameters, etc.), run the experiment and maybe do some cleanup (e.g. remove temporary files).

You do all this work in the `run_the_experiment` method. Remember to pass the raw output via `<<` to the `output` variable and to wrap the experiment in a `benchmark` block. The output is automatically saved to the results directory for further analysis.

The `test_data` method lets you specify an array of data points that you want split for cross-validation (see below). The portion for the current cross-validation is passed to `run_the_experiment` as its first argument.

Next you may want to analyze the data you got. For that there is the `analyze_result!` method, which takes two arguments. The first is the path to the raw data file that was output by your code and the second is the path to an output file for the analysis (this can be very rich in detail, ideal for confusion matrices and the like). The method should return a hash of summary results (e.g. `:total_performance => 16`).

All of this is also saved to disk and available for later analysis.
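
As a rough sketch of what an `analyze_result!` implementation might look like, assume that each line of the raw output holds an expected and a predicted label separated by whitespace. That line format and the `:accuracy` and `:total` keys are assumptions made for this illustration only; Experiment does not prescribe them. The method is shown standalone here, whereas in a project it would live in your class in `experiments/experiment.rb`.

```ruby
# Hypothetical analysis step: the "expected predicted" line format is an
# assumption of this sketch, not something Experiment itself defines.
def analyze_result!(input, output)
  pairs = File.readlines(input).map(&:split).reject(&:empty?)
  correct = pairs.count { |expected, predicted| expected == predicted }

  # Write the detailed analysis (one line per data point) to the second path.
  File.open(output, "w") do |f|
    pairs.each do |expected, predicted|
      f.puts "#{expected}\t#{predicted}\t#{expected == predicted ? 'ok' : 'miss'}"
    end
  end

  # Return the summary hash for this cross-validation run.
  accuracy = pairs.empty? ? 0.0 : correct.to_f / pairs.size
  { :accuracy => accuracy, :total => pairs.size }
end
```

Every value in the returned hash should be numeric, since the summaries compute a mean and a standard deviation per key.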

== Creating an experimental condition

Now on to creating different conditions and measuring them. First call:

  $ experiment generate my_condition -m "This should be a description of what you plan to\
  do, maybe including a hypothesis. Don't worry, you can edit this later."

This will create a directory in `experiments` based on the name you provide (in this case `experiments/my_condition`). In this directory you will find a file defining a class that inherits from the experiment you defined earlier. The file also explicitly requires all the files you wrote in `app`, which gives you the flexibility to delete any of these requires, copy the corresponding file, and modify the copy. It also allows you to override the experiment logic as needed.

Also notice that the description you provided is stored as a comment in that file. You can expand your hypothesis as you work on the file and it will be included in your report automatically.
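
The class name in the generated file is derived from the condition name. A minimal sketch of that conversion, mirroring the `as_class_name` helper in `bin/experiment` (the condition names used here are made up):

```ruby
# Turns a condition name into a class name by splitting on underscores
# and hyphens and capitalizing each part.
def as_class_name(str)
  str.split(/[_-]+/).map(&:capitalize).join
end

puts as_class_name("my_condition")  # MyCondition
puts as_class_name("dtw-baseline")  # DtwBaseline
```

Note that `capitalize` lowercases the rest of each part, so a name such as `myCond` would become `Mycond`. Sticking to lowercase names with underscores avoids surprises.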

== Running the experiment

Once you have made the desired changes you can run the experiment with:

  $ experiment run my_condition --cv 5

This will create a directory in `results` named something like `my_condition-cv5-46424`. The naming convention is the condition name, a summary of the configuration used, and a shortened timestamp to differentiate reruns of the same experiment.
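
To make the naming convention concrete, here is a small sketch that builds a name the same way `Experiment::Base#write_dir!` does. The helper name `results_dir_name` and the fixed timestamp are assumptions for illustration.

```ruby
# Condition name, number of cross-validations, and characters 4..9 of the
# Unix timestamp, joined with hyphens.
def results_dir_name(condition, cvs, now = Time.now)
  "#{condition}-cv#{cvs}-#{now.to_i.to_s[4..9]}"
end

puts results_dir_name("my_condition", 5, Time.at(1288483200))
# my_condition-cv5-483200
```

Because only part of the timestamp is kept, the suffix distinguishes reruns made close together but is not unique forever.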

The experimental results and benchmarks are written to this directory, along with a specification YAML file that details all conditions of the experiment.

Note that you can pass several conditions to the run command; they are run sequentially, all with the same options.

== Configuration

So far we have been talking mainly about variations in the source code of the experiments. But what if you just want to tweak a few parameters? There is always the almighty *Config* class to the rescue.

  Experiment::Config[:my_config_variable] # anywhere in your code

Where does this come from? You have a `config` directory containing a `config.yaml` file. This file contains several environments. The idea is that you might want to tweak your options differently when running on your laptop than when running on a university supercomputer ;)

`development` is the default environment; you can select any other with the `--env` option.

Notice that each experimental condition also gets its own `config.yaml` file when it is generated. This file overrides the main config file, so you can introduce condition-specific options.

Finally, when running an experiment you can use the `-o` or `--options` option to override any config value you want.

Let me give you an example. Your main config file looks like this:

  environments:
    development:
      ref_dir: /Users/kubowo/Desktop/points-vals
      master_dir: /Users/kubowo/Desktop/points-vals/s014
      alpha: 0.4
    compute:
      ref_dir: /afs/group/DB/points
      master_dir: /afs/group/DB/points/s145
      alpha: 0.4

Your `experiments/my_condition/config.yaml` looks like this:

  experiment:
    development:
      alpha: 0.5
    compute:
      alpha: 0.6

And you run the experiment with

  $ experiment run my_condition --env compute -o "master_dir: /Users/kubowo/Desktop/points-vals/s015"

Then your final config will look like this:

  { :ref_dir => "/afs/group/DB/points",
    :master_dir => "/Users/kubowo/Desktop/points-vals/s015",
    :alpha => 0.6 }

Flexible, eh?
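
The override order can be sketched with plain hashes: the environment section of the main config comes first, then the condition's config, then the options string. This is an illustration of the precedence only, not the library's own code; the `parse_options` helper is a hypothetical stand-in and the inline YAML is a shortened copy of the example above.

```ruby
require "yaml"

# Parses an options string such as "key: value, key2: value2" into a hash.
# Splitting each pair only once keeps values that contain colons intact.
def parse_options(str)
  Hash[str.split(/, ?/).map { |pair| pair.split(/: ?/, 2) }]
end

main = YAML.load(<<~YAML)
  environments:
    development:
      ref_dir: /Users/kubowo/Desktop/points-vals
      alpha: 0.4
    compute:
      ref_dir: /afs/group/DB/points
      master_dir: /afs/group/DB/points/s145
      alpha: 0.4
YAML

condition = YAML.load(<<~YAML)
  experiment:
    development:
      alpha: 0.5
    compute:
      alpha: 0.6
YAML

env = "compute"
overrides = parse_options("master_dir: /Users/kubowo/Desktop/points-vals/s015")

# Later hashes win: main config, then condition config, then -o options.
config = main["environments"][env]
  .merge(condition["experiment"][env])
  .merge(overrides)

p config
```

In this sketch the keys are strings, and anything that comes from the options string stays a string (so `-o "alpha: 0.7"` would yield `"0.7"`, not a float).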

== Cross Validation

Cross validation (CV) is one of the most crucial research methods in CS and AI. For that reason it is built right in. You specify how many CVs you want to run using the `--cv` flag. Your data is then split up automatically and the experiment is run once per CV with the appropriate data.
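
How the data is divided can be sketched as a round-robin split, in the spirit of `Experiment::Base#split_up_data`. The standalone method signature and the sample data are assumptions for illustration.

```ruby
# Distributes the data points round-robin over `cvs` buckets.
def split_up_data(test_data, cvs)
  data = Array.new(cvs) { [] }
  test_data.each_with_index { |item, i| data[i % cvs] << item }
  data
end

folds = split_up_data((1..10).to_a, 3)
folds.each_with_index { |fold, i| puts "CV #{i}: #{fold.inspect}" }
# CV 0: [1, 4, 7, 10]
# CV 1: [2, 5, 8]
# CV 2: [3, 6, 9]
```

Each run receives only its own bucket. If your method needs a training set as well as a held-out set, deriving it from `test_data` inside `run_the_experiment` is one option.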

== Reporting Results

  $ experiment report

Surprise, surprise. This will create two files in your `report` directory (BTW, this directory is also meant for you to store your report or paper draft). The first is `method.mmd`. It takes everything you wrote at the beginning of your experimental condition files and creates a MultiMarkdown (http://fletcherpenney.net/multimarkdown/) file out of it. I chose MultiMarkdown for its LaTeX support and because it can be imported directly into Scrivener, my writing application of choice (available at http://www.literatureandlatte.com/scrivener.html).

The second file created is `data.csv`, which contains the data from all your experiments. It should be importable into Numbers, Excel, or even Matlab for further analysis and charting.
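
For the first of these files, the description is taken from the leading comment block of each condition file. A minimal sketch of that extraction, using the same `# text` pattern as the `report` command in `bin/experiment` (the helper name `leading_comment` and the sample source are made up):

```ruby
# Collects the leading "# ..." comment lines of a source string, stopping at
# the first line that is not such a comment (a blank line also stops it).
def leading_comment(source)
  source.each_line.map(&:chomp)
        .take_while { |line| line =~ /^# (.+)/ }
        .map { |line| line.sub(/^# /, "") }
end

source = <<~RUBY
  # ## My Condition ##
  # Smaller alpha should converge faster.

  # This comment is not part of the first block.
  class MyCondition < MyExperiment
  end
RUBY

puts leading_comment(source)
```

Only the first two comment lines are printed here, so anything you want in the report has to stay in one uninterrupted comment block at the top of the file.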

== Misc

So that's pretty much the gist of Experiment. There are a few other features, and a few more coming soon to a gem near you ;-)

Also check out the RDocs: http://rdoc.info/github/gampleman/Experiment/master/frames
data/Rakefile
ADDED
@@ -0,0 +1,26 @@
require 'rubygems'
gem 'hoe', '>= 2.1.0'
require 'hoe'
require 'fileutils'
require './lib/experiment'

Hoe.plugin :newgem
# Hoe.plugin :website
# Hoe.plugin :cucumberfeatures

# Generate all the Rake tasks
# Run 'rake -T' to see list of generated tasks (from gem root directory)
$hoe = Hoe.spec 'experiment' do
  self.developer 'Jakub Hampl', 'honitom@seznam.cz'
  #self.post_install_message = 'PostInstall.txt' # TODO remove if post-install message not required
  self.rubyforge_name = self.name # TODO this is default value
  # self.extra_deps = [['activesupport','>= 2.0.2']]

end

require 'newgem/tasks'
Dir['tasks/**/*.rake'].each { |t| load t }

# TODO - want other tests/tasks run by default? Add them to the list
# remove_task :default
# task :default => [:spec, :features]
data/bin/experiment
ADDED
@@ -0,0 +1,277 @@
#!/usr/bin/env ruby

# == Synopsis
#   This program will run an experimental batch or generate files
#   for a new experiment
#
# == Examples
#   Running an experiment
#     experiment run --env dice experiment1 experiment2 ...
#
#   Generating a new project
#     experiment new project_name
#
#   Generating a new experimental condition
#     experiment generate experiment_name -m "description"
#
#   List all available experiments
#     experiment list
#
#
# == Usage
#   experiment command [options]
#
#   For help use: experiment -h
#
# == Options
#   -h, --help          Displays help message
#   -v, --version       Display the version, then exit
#   -q, --quiet         Output as little as possible, overrides verbose
#   -V, --verbose       Verbose output
#   -e, --env           Sets the environment to run in
#                       Defaults to development
#   -c, --cv            Number of Cross validations to run
#   -m, --description   A description of the current experiment
#   -o, --options       Config overrides, e.g. "key: value, key2: value2"
#
# == Author
#   Jakub Hampl
#

#require "rubygems"
require 'date' # DateTime is used for the verbose start/finish timestamps
require 'optparse'
#require 'rdoc/usage'
require 'ostruct'
#require File.dirname(__FILE__) + "/experiment"

class App
  VERSION = '1.0'

  # RDoc::usage is unavailable because rdoc/usage is not required above,
  # so the usage text is printed from this string instead.
  USAGE = "Usage: experiment command [options]\n\nFor help use: experiment -h"

  attr_reader :options

  def initialize(arguments, stdin)
    @arguments = arguments
    @stdin = stdin

    # Set defaults
    @options = OpenStruct.new
    @options.verbose = false
    @options.quiet = false
    @options.env = :development
    @options.cv = 5
    @options.n_classes = 10
    @options.kind = "d"
    @options.description = ""
    @options.opts = "" # an options string; Config.parse splits it

  end

  # Parse options, check arguments, then process the command
  def run

    if parsed_options? && arguments_valid?

      puts "Start at #{DateTime.now}\n\n" if @options.verbose

      output_options if @options.verbose # [Optional]
      require File.dirname(__FILE__) + "/vendor/backports/backports" if @options.env == :dice

      process_arguments
      process_command

      puts "\nFinished at #{DateTime.now}" if @options.verbose

    else
      output_usage
    end

  end

  protected

  def parsed_options?

    # Specify options
    opts = OptionParser.new
    opts.on('-v', '--version') { output_version ; exit 0 }
    opts.on('-h', '--help') { output_help }
    opts.on('-V', '--verbose') { @options.verbose = true }
    opts.on('-q', '--quiet') { @options.quiet = true }
    # Accept any environment name defined in config.yaml (e.g. compute)
    opts.on('-e', '--env ENV', String) { |v| @options.env = v.to_sym }
    opts.on('-c', '--cv CV', Integer) { |v| @options.cv = v }
    opts.on('-n', '--number NUMBER', Integer) { |v| @options.n_classes = v }
    opts.on('-k', '--kind KIND', String) { |v| @options.kind = v }
    opts.on('-m', '--description M', String) { |v| @options.description = v }
    opts.on('-o', '--options OPTSTRING', String) do |v|
      @options.opts = v
    end
    opts.parse!(@arguments) rescue return false

    process_options
    true
  end

  # Performs post-parse processing on options
  def process_options
    @options.verbose = false if @options.quiet
  end

  def output_options
    puts "Options:\n"

    @options.marshal_dump.each do |name, val|
      puts " #{name} = #{val}"
    end
  end

  # True if required arguments were provided
  def arguments_valid?
    !@arguments.empty?
  end

  # Setup the arguments
  def process_arguments
    # TO DO - place in local vars, etc
  end

  def output_help
    output_version
    puts USAGE
    exit 0
  end

  def output_usage
    puts USAGE
  end

  def output_version
    puts "#{File.basename(__FILE__)} version #{VERSION}"
  end

  def process_command
    if @arguments.first == 'generate'
      dir = "./experiments/" + @arguments[1]
      Dir.mkdir(dir)
      File.open(dir + "/" + @arguments[1] + ".rb", "w") do |req_file|
        req_file.puts "# ## #{as_human_name @arguments[1]} ##"
        req_file.puts "# "+@options.description.split("\n").join("\n# ")
        req_file.puts
        req_file.puts
        req_file.puts "# The first contiguous block of comment will be included in your report."
        req_file.puts "# This includes the reference implementation."
        req_file.puts "# Override any desired files in this directory."
        Dir["./app/**/*.rb"].each do |f|
          p = f.split("/") - File.expand_path(".").split("/")
          req_file.puts "require File.dirname(__FILE__) + \"/../../#{p.join("/")}\""
        end
        req_file.puts "\nclass #{as_class_name @arguments[1]} < MyExperiment\n\t\nend"
      end
      File.open(dir + "/config.yaml", "w") do |f|
        f << "---\nexperiment:\n development:\n compute:\n"
      end

    elsif @arguments.first == "new" # generate a new project
      require 'fileutils'
      dir = "./" + @arguments[1]
      Dir.mkdir(dir)
      %w[app config experiments report results test tmp vendor].each do |d|
        Dir.mkdir(dir + "/" + d)
      end
      basedir = File.dirname(__FILE__) + "/.."
      File.open(File.join(dir, "config", "config.yaml"), "w") do |f|
        f << "---\nenvironments:\n development:\n compute:\n"
      end
      File.open(File.join(dir, ".gitignore"), "w") do |f|
        f << "tmp/*"
      end
      FileUtils::cp File.join(basedir, "lib/experiment/generator/readme_template.txt"), File.join(dir, "README")
      FileUtils::cp File.join(basedir, "lib/experiment/generator/Rakefile"), File.join(dir, "Rakefile")
      FileUtils::cp File.join(basedir, "lib/experiment/generator/experiment_template.rb"), File.join(dir, "experiments", "experiment.rb")
    elsif @arguments.first == "list"
      puts "Available experiments:"
      puts " " + Dir["./experiments/*"].map{|a| File.basename(a) }.join(", ")
    elsif @arguments.first == "report"
      dir = "./report/"
      File.open(dir + "method.mmd", "w") do |f|
        f.puts "# Methods #"
        Dir["./experiments/*/*.rb"].each do |desc|
          if File.basename(desc) == File.basename(File.dirname(desc)) + ".rb"
            # Copy the leading comment block of each condition file
            File.read(desc).split("\n").each do |line|
              if m = line.match(/^\# (.+)/)
                f.puts m[1]
              else
                break
              end
            end
            f.puts
            f.puts
          end
        end
      end
      require 'csv'
      require "yaml"
      require File.dirname(__FILE__)+"/../lib/experiment/stats"
      CSV.open(dir + "data.csv", "w") do |csv|
        data = {}
        Dir["./results/*/results.yaml"].each do |res|
          d = YAML::load_file(res)
          da = {}
          d.each do |k, vals|
            da[k.to_s + " mean"], da[k.to_s + " sd"] = Stats::mean(vals), Stats::standard_deviation(vals)
            vals.each_with_index do |v, i|
              da[k.to_s + " cv:" + i.to_s] = v
            end
          end
          array_merge(data, da)
        end
        data.keys.map do |key|
          # calculate stats
          a = data[key]
          [key] + a
        end.transpose.each do |row|
          csv << row
        end
      end
    elsif @arguments.shift == "run"
      require File.dirname(__FILE__) + "/../lib/experiment/base"
      # "." is not on the load path in Ruby 1.9.2+, so require relative to the cwd
      require "./experiments/experiment"
      @arguments.each do |exp|
        require "./experiments/#{exp}/#{exp}"
        cla = Object.const_get(as_class_name(exp))
        experiment = cla.new exp, @options.opts, @options.env
        experiment.run! @options.cv
      end
    else
      output_usage
    end

  end


  private
  def array_merge(h1, h2)
    h2.each do |key, value|
      h1[key] ||= []
      h1[key] << value
    end
  end

  def as_class_name(str)
    str.split(/[\_\-]+/).map(&:capitalize).join
  end

  def as_human_name(str)
    str.split(/[\_\-]+/).map(&:capitalize).join(" ")
  end

  def process_standard_input
    input = @stdin.read
    # TO DO - process input

    # [Optional]
    # @stdin.each do |line|
    #  # TO DO - process each line
    #end

  end
end


# Create and run the application
app = App.new(ARGV, STDIN)
app.run
data/lib/experiment.rb
ADDED
data/lib/experiment/base.rb
ADDED
@@ -0,0 +1,162 @@
require File.dirname(__FILE__) + "/notify"
require File.dirname(__FILE__) + "/stats"
require File.dirname(__FILE__) + "/config"
require 'benchmark'

module Experiment
  class Base
    attr_reader :dir, :current_cv, :cvs

    def initialize(experiment, options, env)
      @experiment = experiment

      Experiment::Config::load(experiment, options, env)
      require "./experiments/#{experiment}/#{experiment}"
      @abm = []
    end

    # runs the whole experiment
    def run!(cv)
      @cvs = cv || 1
      @results = {}
      Notify.print "Running #{@experiment} "
      split_up_data
      write_dir!
      specification!

      @cvs.times do |cv_num|
        @bm = []
        @current_cv = cv_num
        File.open(@dir + "/raw-#{cv_num}.txt", "w") do |output|
          run_the_experiment(@data[cv_num], output)
        end
        array_merge @results, analyze_result!(@dir + "/raw-#{cv_num}.txt", @dir + "/analyzed-#{cv_num}.txt")
        write_performance!
        Notify.print "."
      end
      summarize_performance!
      summarize_results! @results
      Notify.print result_line
    end


    # Registers and performs a benchmark which is then
    # added to the total and average times
    def benchmark(label = "", &block)
      @bm << Benchmark.measure("CV #{@current_cv} #{label}", &block)
    end


    # Creates the results directory for the current experiment
    def write_dir!
      @dir = "./results/#{@experiment}-cv#{@cvs}-#{Time.now.to_i.to_s[4..9]}"
      Dir.mkdir @dir
    end

    # Writes a yaml specification of all the options used to run the experiment
    def specification!
      File.open(@dir + '/specification.yaml', 'w' ) do |out|
        YAML.dump({:name => @experiment, :date => Time.now, :configuration => Experiment::Config.to_h, :cross_validations => @cvs}, out )
      end
    end



    # Writes a file called 'performance_table.txt' which
    # details all the benchmarks performed
    def write_performance!
      performance_f do |f|
        f << "Cross Validation #{@current_cv} " + Benchmark::CAPTION
        f << @bm.map {|m| m.format("%19n "+Benchmark::FMTSTR)}.join
        total = @bm.reduce(0) {|t, m| m + t}
        f << total.format(" Total: "+Benchmark::FMTSTR)
        @abm << total
      end
    end

    # Calculates the average performance and writes it
    def summarize_performance!
      performance_f do |f|
        total = @abm.reduce(0) {|t, m| m + t} / @abm.count
        f << total.format(" Average: "+Benchmark::FMTSTR)
      end
    end

    # creates a summary of the results and writes it to
    # 'results.yaml' and 'summary.mmd'
    def summarize_results!(results)
      File.open(@dir + '/results.yaml', 'w' ) do |out|
        YAML.dump(results, out )
      end

      # create an array of arrays
      res = results.keys.map do |key|
        # calculate stats
        a = results[key]
        [key] + a + [Stats::mean(a), Stats::standard_deviation(a)]
      end

      ls = results.keys.map{|v| v.to_s.length }

      ls = ["Standard Deviation".length] + ls
      res = [["cv"] + (1..cvs).to_a.map(&:to_s) + ["Mean", "Standard Deviation"]] + res

      out = ""
      res.transpose.each do |col|
        col.each_with_index do |cell, i|
          l = ls[i]
          out << "| "
          if cell.is_a?(String) || cell.is_a?(Symbol)
            out << sprintf("%#{l}s", cell)
          else
            out << sprintf("%#{l}.3f", cell)
          end
          out << " "
        end
        out << "|\n"
      end
      File.open(@dir + "/summary.mmd", 'w') do |f|
        f << "## Results for #{@experiment} ##\n\n"
        f << out
      end
      #results = results.reduce({}) do |tot, res|
      #  cv = res.delete :cv
      #  tot.merge Hash[res.to_a.map {|a| ["cv_#{cv}_#{a.first}".to_sym, a.last]}]
      #end
      #FasterCSV.open("./results/all.csv", "a") do |csv|
      #  csv << results.to_a.sort_by{|a| a.first.to_s}.map(&:last)
      #end
    end

    def result_line
      " Done\n"
    end

    # A silly method meant to be overridden.
    # should return an array, which will be then split up for cross-validating
    def test_data
      (1..cvs).to_a
    end

    def split_up_data
      @data = []
      test_data.each_with_index do |item, i|
        @data[i % cvs] ||= []
        @data[i % cvs] << item
      end
      @data
    end

    private
    # Yields a handle to the performance table
    def performance_f(&block) # just a simple wrapper to make code a little DRYer
      File.open(@dir+"/performance_table.txt", "a", &block)
    end

    def array_merge(h1, h2)
      h2.each do |key, value|
        h1[key] ||= []
        h1[key] << value
      end
    end
  end
end
data/lib/experiment/config.rb
ADDED
@@ -0,0 +1,40 @@
require "yaml"

module Experiment
  class Config
    class << self

      # the load method takes the basic config file, which is then
      # overridden by the experimental config file and finally by
      # the options string (which should be in this format:
      # "key: value, key2:value2,key3: value3")
      def load(experiment, options, env = :development)
        init env
        expath = File.expand_path("./experiments/#{experiment}/config.yaml")
        if File.exist? expath
          exp = YAML::load_file(expath)
          @config.merge! exp["experiment"][env.to_s] if exp["experiment"][env.to_s].is_a? Hash
        end
        @config.merge! parse(options)
      end

      def init(env = :development)
        conf = YAML::load_file("./config/config.yaml")
        # an environment with no keys yet loads as nil, so fall back to {}
        @config = conf["environments"][env.to_s] || {}
      end

      def [](v)
        @config[v.to_s]
      end

      def parse(options)
        return {} if options.nil? || options.empty?
        # split each pair only once so values may themselves contain colons
        Hash[options.split(/, ?/).map { |a| a.split(/: ?/, 2) }]
      end

      def to_h
        @config
      end
    end

  end
end
data/lib/experiment/generator/Rakefile
ADDED
@@ -0,0 +1,17 @@
require 'rake'
require 'rake/testtask'
require "rdoc/task" # replaces rake/rdoctask, which newer Rake versions no longer ship

task :default => [:test]

Rake::TestTask.new do |t|
  t.libs += ["app/input", "app/dtw", "app/models", "."]
  t.test_files = FileList['test/unit/*.rb']
  t.verbose = true
end

RDoc::Task.new do |rd|
  rd.main = "README"
  rd.rdoc_files.include("README", "app/**/*.rb")
  rd.rdoc_dir = "doc"
end
data/lib/experiment/generator/experiment_template.rb
ADDED
@@ -0,0 +1,35 @@
class MyExperiment < Experiment::Base

  def test_data
    # TODO: Specify an array of all the test data.
    # It will be split up automatically for you across cross-validations
  end

  def run_the_experiment(data, output)
    # TODO: Define how you will run the experiment
    # Remember, each separate experiment inherits from this base class and includes
    # its own files, so this should be a rather generic implementation

    # 1. prepare any necessary setup like I/O lists, etc...

    # 2. do the experiment
    benchmark do
      output << "" # TODO: replace "" with the raw output of your code
    end

    # 3. clean up

  end

  def analyze_result!(input, output)
    # TODO perform an analysis of what your program did

    # remember to return a hash of meaningful data, best of all a summary
  end

  # you might want to override this method as well:
  # def summarize_results!(results)
  #   super(results)
  # end

end
data/lib/experiment/generator/readme_template.txt
ADDED
@@ -0,0 +1,24 @@
This project has a slightly complex structure, which is nonetheless beneficial for automating
many mundane tasks. So a brief description of the files/folders included:

app
  - Contains the base implementation.
config
  - Configuration for running in different environments.
doc
  - Documentation for the reference implementation.
experiments
  - This directory includes all experiments that were coded. They generally `require`
    files from the reference implementation and add modifications of their own.
    Each is explained in its `about.md` file.
report
  - Source files used to create the report (multi-markdown format, see http://fletcherpenney.net/multimarkdown).
results
  - Has all the measurements from individual experiments. Naming convention:
    {name}-{classes}-cv{number of cross validations}-{shortened timestamp}.
test
  - Unit tests.
tmp
  - Various stuff stored here. Ignore.
vendor
  - Library code and scripts not written by me.
data/lib/experiment/stats.rb
ADDED
@@ -0,0 +1,43 @@
class Stats
  class << self

    def sum(ar, &block)
      ar.reduce(0.0) {|asum, a| (block_given? ? yield(a) : a) + asum}
    end

    # Sample variance (divides by n - 1)
    def variance(ar)
      m = mean(ar) # compute the mean once instead of once per element
      v = sum(ar) {|x| (m - x)**2.0 }
      v/(ar.count - 1.0)
    end

    def standard_deviation(ar)
      Math.sqrt(variance(ar))
    end

    def z_scores(ar)
      ar.map {|x| z_score(ar, x)}
    end

    def z_score(ar, x)
      (x - mean(ar)) / standard_deviation(ar)
    end

    def range(ar)
      ar.max - ar.min
    end

    def mean(ar)
      sum(ar) / ar.count
    end

    def median(ar)
      a = ar.sort
      if ar.count.odd?
        a[(ar.count-1)/2]
      else
        (a[ar.count/2 - 1] + a[ar.count/2]) / 2.0
      end
    end

  end
end
data/test/test_helper.rb
ADDED
metadata
ADDED
@@ -0,0 +1,112 @@
--- !ruby/object:Gem::Specification
name: experiment
version: !ruby/object:Gem::Version
  prerelease: false
  segments:
  - 0
  - 0
  - 1
  version: 0.0.1
platform: ruby
authors:
- Jakub Hampl
autorequire:
bindir: bin
cert_chain: []

date: 2010-10-31 01:00:00 +01:00
default_executable:
dependencies:
- !ruby/object:Gem::Dependency
  name: rubyforge
  prerelease: false
  requirement: &id001 !ruby/object:Gem::Requirement
    none: false
    requirements:
    - - ">="
      - !ruby/object:Gem::Version
        segments:
        - 2
        - 0
        - 4
        version: 2.0.4
  type: :development
  version_requirements: *id001
- !ruby/object:Gem::Dependency
  name: hoe
  prerelease: false
  requirement: &id002 !ruby/object:Gem::Requirement
    none: false
    requirements:
    - - ">="
      - !ruby/object:Gem::Version
        segments:
        - 2
        - 6
        - 2
        version: 2.6.2
  type: :development
  version_requirements: *id002
description: ""
email:
- honitom@seznam.cz
executables:
- experiment
extensions: []

extra_rdoc_files:
- History.txt
- Manifest.txt
- lib/experiment/generator/readme_template.txt
files:
- History.txt
- Manifest.txt
- README.rdoc
- Rakefile
- lib/experiment.rb
- lib/experiment/config.rb
- lib/experiment/stats.rb
- lib/experiment/generator/readme_template.txt
- lib/experiment/generator/experiment_template.rb
- lib/experiment/generator/Rakefile
- lib/experiment/base.rb
- lib/experiment/notify.rb
- test/test_experiment.rb
- test/test_helper.rb
- bin/experiment
has_rdoc: true
homepage: http://github.com/gampleman/experiment
licenses: []

post_install_message:
rdoc_options:
- --main
- README.rdoc
require_paths:
- lib
required_ruby_version: !ruby/object:Gem::Requirement
  none: false
  requirements:
  - - ">="
    - !ruby/object:Gem::Version
      segments:
      - 0
      version: "0"
required_rubygems_version: !ruby/object:Gem::Requirement
  none: false
  requirements:
  - - ">="
    - !ruby/object:Gem::Version
      segments:
      - 0
      version: "0"
requirements: []

rubyforge_project: experiment
rubygems_version: 1.3.7
signing_key:
specification_version: 3
summary: ""
test_files:
- test/test_experiment.rb
- test/test_helper.rb