datafarming 1.0.0

checksums.yaml ADDED
@@ -0,0 +1,7 @@
1
+ ---
2
+ SHA256:
3
+ metadata.gz: 5aa6e5a57e29e45261e3322740ae3694c0c5eb7e32f841b71a15fb4da631d2d1
4
+ data.tar.gz: c7d299e10278175d1a87020f9e84bc91d697754ab981c1af07174d25c8467e9a
5
+ SHA512:
6
+ metadata.gz: fffc9ed4734f4d1a8ee9ebd18bacbe7f9f679ade49b8910e61ac751de2adb8f7ec5bd2475b7572597d31307f8ab08d425fb2db176438527a4c68264d0e8fdbd6
7
+ data.tar.gz: 8bed8adb0b6d0e1f1ad19ad5b3e63517d590ce07e8128e18c4eb24d9545b3c03ac68fcd5bd26ec497f643995c4b31563ed310162c5df38da22669ea1405376d8
data/README.md ADDED
@@ -0,0 +1,49 @@
1
+ ### SETUP
2
+
3
+ This gem requires Ruby v2.3 or later to be installed on your system.
4
+ - On MacOS, Ruby comes pre-installed. Note: You can use the homebrew package manager from [brew.sh](https://brew.sh) if you want to install a newer version of Ruby, which is noticeably faster than the one that currently ships with MacOS.
5
+ - Windows users should use one of the installer packages available from [rubyinstaller.org](http://rubyinstaller.org/). You will need to determine whether you have a 32-bit or 64-bit processor, and download the appropriate installer. We recommend Ruby v2.5.x or later, as it is substantially faster than prior versions. When running the installer, check the checkboxes to add the Ruby installation to your PATH and to associate `.rb` and `.rbw` files with Ruby. A terminal window will pop up during the installation. Press "enter" to automatically install all of the required tools, and when prompted again (several minutes later) press "enter" again to end the installation.
6
+
7
+ You can check that Ruby is properly installed by opening `Terminal.app` (MacOS) or `CMD.EXE` (Windows) and typing `ruby --version` (note: two dashes). You should see a response telling you which version of Ruby you are running.
8
+
9
+ After your Ruby installation is confirmed you can install the data farming Ruby scripts by running the following command.
10
+
11
+ gem install datafarming
12
+
13
+ This assumes that you have a network connection. Additional dependencies will be installed automatically. On MacOS or Linux systems, you may be required to use the `sudo` command to authenticate the installation. In that case, type `sudo gem install datafarming` and enter your password when prompted.
14
+
15
+ Gem installation only needs to be done once. If you explicitly downloaded the `.gem` file rather than using a network installation, you can delete it after performing the installation.
16
+
17
+ ### USAGE
18
+
19
+ Ruby is a powerful and concise object-oriented scripting language. Scripts are normally run from a command-line or terminal environment by typing the `ruby` command followed by a script name, often followed by one or more command-line arguments. However, scripts installed as gems do not need the `ruby` command to be typed explicitly. For example, typing `stripheaderdups.rb my_file.txt` will invoke the `stripheaderdups.rb` script and apply it to file `my_file.txt` in your current working directory. Note that on Windows the `.rb` suffix is not needed to run your scripts, e.g., `stripheaderdups my_file.txt` will suffice.
20
+
21
+ All scripts in this distribution are self-documenting if run with a `--help`, `-h`, or `-?` option. Quick descriptions follow.
22
+
23
+ ### GENERAL FILE MANIPULATION SCRIPTS:
24
+ - `blank2csv.rb` —
25
+ converts whitespace-delimited files to Comma Separated Values (CSV) files
26
+ - `cat.rb` —
27
+ Unix “cat” program work-alike for Windows—for viewing or creating text files or concatenating multiple files, based on stdio
28
+ - `convert_line_endings.rb` —
29
+ converts text files (including CSV files) to your current operating system environment (e.g., DOS to Unix or Unix to DOS)
30
+ - `csv2blank.rb` —
31
+ converts CSV files to whitespace-delimited files
32
+
33
+ ### DATA FARMING RUBY SCRIPTS:
34
+ - `batchrunner.rb` —
35
+ run a model interactively with replication
36
+ - `rundesign_general.rb` —
37
+ run control to replicate a designed experiment with a model
38
+ - `pool_files.rb` —
39
+ pool columns from separate files into a new file, useful for combining design file & output files or multiple output files
40
+ - `stack_nolhs.rb` —
41
+ generate designs by reassigning columns from built-in NOLHs
42
+ - `stripheaders.rb`, `stripheaderdups.rb` —
43
+ used to remove headers from a file, or duplicate headers within a file, respectively
44
+ - `augment_design.rb` —
45
+ generates star points to augment a fractional factorial
46
+ - `cross.rb` —
47
+ creates a combinatorial design by crossing all combinations of any number of individual smaller designs
48
+ - `mser.rb` —
49
+ uses MSER truncation to remove initial transient effects for time-series output, reports truncated average and number of observations for each run to facilitate construction of a properly weighted confidence interval.
@@ -0,0 +1,24 @@
1
+ # -*- ruby -*-
2
+ _VERSION = "1.0.0"
3
+
4
+ Gem::Specification.new do |s|
5
+ s.name = "datafarming"
6
+ s.version = _VERSION
7
+ s.date = "2018-06-25"
8
+ s.summary = "Useful scripts for data farming."
9
+ s.homepage = "https://gitlab.nps.edu/pjsanche/datafarmingrubyscripts.git"
10
+ s.email = "pjs@alum.mit.edu"
11
+ s.description = "Ruby scripts for data farming, including pre- and post-processing, design generation and scaling, and run control."
12
+ s.author = "Paul J Sanchez"
13
+ s.files = `git ls-files -z`.split("\x0").reject do |f|
14
+ f.match(%r{^(test/|features/|\.gitignore)})
15
+ end
16
+ s.bindir = "exe"
17
+ s.executables = s.files.grep(%r{^exe/}) { |f| File.basename(f) }
18
+ s.require_paths = ["lib"]
19
+ s.add_runtime_dependency 'fwt', '~> 1.2'
20
+ s.add_runtime_dependency 'colorize', '~> 0.8'
21
+ s.add_runtime_dependency 'quickstats', '~> 2'
22
+ s.required_ruby_version = '>= 2.3'
23
+ s.license = 'MIT'
24
+ end
@@ -0,0 +1,66 @@
1
+ #!/usr/bin/env ruby -w
2
+
3
+ # This script generates star points to augment a fractional factorial
4
+ # so you can check for quadratic effects.
5
+
6
+ require 'colorize'
7
+
8
+ String.disable_colorization false
9
+
10
+ require 'optparse'
11
+ require 'datafarming/error_handling'
12
+
13
+ help_msg = [
14
+ 'Generate star points to augment a fractional factorial ' \
15
+ 'with quadratic effects.',
16
+ 'Results are white-space delimited data written to ' +
17
+ 'stdout'.light_blue + ', and can be redirected', 'as desired.', '',
18
+ 'Syntax:',
19
+ "\n\t#{ErrorHandling.prog_name} [--help] FACTOR_INFO".yellow, '',
20
+ "Arguments in square brackets are optional. A vertical bar '|'",
21
+ 'indicates valid alternatives for invoking the option. Prefix',
22
+ 'the command with "' + 'ruby'.yellow +
23
+ '" if it is not on your PATH.', '',
24
+ ' --help | -h | -? | ?'.green,
25
+ "\tProduce this help message.",
26
+ ' FACTOR_INFO'.green,
27
+ "\tEITHER the number of factors (produces standardized +/-1 design),",
28
+ "\tOR pairs of values " + 'low1 hi1 low2 hi2...lowN hiN'.green +
29
+ ' for each of the',
30
+ "\tN factors.", ''
31
+ ]
32
+
33
+ OptionParser.new do |opts|
34
+ opts.banner = "Usage: #{$PROGRAM_NAME} [-h|--help] [number of factors]"
35
+ opts.on('-h', '-?', '--help') { ErrorHandling.clean_abort help_msg }
36
+ end.parse!
37
+
38
+ if ARGV.length == 0 || (ARGV[0] == '?') || (ARGV.length > 1 && ARGV.length.odd?)
39
+ ErrorHandling.clean_abort help_msg
40
+ else
41
+ num_factors = 0
42
+ if ARGV.length == 1
43
+ num_factors = ARGV.shift.to_i
44
+ x = Array.new(num_factors, ' 0')
45
+ ranges = Array.new(2 * num_factors)
46
+ ranges.each_index { |i| ranges[i] = (i.even? ? '-1' : ' 1') }
47
+ else
48
+ num_factors = ARGV.length / 2
49
+ x = Array.new(num_factors)
50
+ ranges = ARGV
51
+ num_factors.times do |i|
52
+ x[i] = 0.5 * (ranges[2 * i].to_f + ranges[2 * i + 1].to_f)
53
+ end
54
+ end
55
+ # create and print an array at the global center point (0,0,...,0)
56
+ puts x.join(' ')
57
+
58
+ # now generate and print the star points at +/-1 along each factor axis
59
+ num_factors.times do |i|
60
+ y = x.clone
61
+ y[i] = ranges[2 * i]
62
+ puts y.join(' ')
63
+ y[i] = ranges[2 * i + 1]
64
+ puts y.join(' ')
65
+ end
66
+ end
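The star-point construction in the script above can be sketched as a standalone function. This is a minimal illustration, not code from the gem: `star_points` is a hypothetical helper name, and it assumes each factor is given as a numeric `[low, high]` pair. It returns the center point followed by two axial points per factor, so `k` factors yield `2k + 1` design points.

```ruby
# Hypothetical helper (not part of the datafarming gem).
# ranges: array of [low, high] numeric pairs, one pair per factor.
# Returns the center point followed by 2 star points per factor.
def star_points(ranges)
  # The center point sits at the midpoint of every factor's range.
  center = ranges.map { |low, high| 0.5 * (low + high) }
  points = [center]
  ranges.each_with_index do |(low, high), i|
    # Move factor i to each extreme while all other factors stay centered.
    [low, high].each do |level|
      point = center.dup
      point[i] = level
      points << point
    end
  end
  points
end

# Two factors: one standardized to +/-1, one ranging over 0..10.
star_points([[-1.0, 1.0], [0.0, 10.0]]).each { |pt| puts pt.join(' ') }
```

Appending these rows to a two-level fractional factorial gives three levels per factor, which is what makes quadratic effects estimable.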
@@ -0,0 +1,46 @@
1
+ #!/usr/bin/env ruby -w
2
+
3
+ require 'colorize'
4
+
5
+ String.disable_colorization false
6
+
7
+ require 'optparse'
8
+ require 'datafarming/error_handling'
9
+
10
+ help_msg = [
11
+ 'Run a model interactively with replication.', '',
12
+ 'Prompts the user interactively for a model/program to be run,',
13
+ 'parameter values to use on the model command-line, and desired',
14
+ 'number of replications. The model is run the specified number',
15
+ 'of times with the given arguments. Separate output files are',
16
+ 'created for each run of the model.', '',
17
+ 'Syntax:',
18
+ "\n\t#{ErrorHandling.prog_name} [--help] ".yellow +
19
+ '[--outfile fname]'.yellow, '',
20
+ "Arguments in square brackets are optional. A vertical bar '|'",
21
+ 'indicates valid alternatives for invoking the option. Prefix',
22
+ 'the command with "' + 'ruby'.yellow +
23
+ '" if it is not on your PATH.', '',
24
+ ' --help | -h | -? | ?'.green,
25
+ "\tProduce this help message.",
26
+ ' --outfile fname | -o fname'.green,
27
+ "\tSpecify a prefix for output file names, defaults to 'outfile'."
28
+ ]
29
+
30
+ outfile_name = 'outfile'
31
+ OptionParser.new do |opts|
32
+ opts.banner = "Usage: #{$PROGRAM_NAME} [OPTIONS]"
33
+ opts.on('-h', '-?', '--help') { ErrorHandling.clean_abort help_msg }
34
+ opts.on('-o', '--outfile fname') { |fname| outfile_name = fname }
35
+ end.parse!
36
+
37
+ ErrorHandling.clean_abort help_msg if ARGV[0] == '?'
38
+
39
+ cmd = (STDERR.print 'Enter command: '; gets.strip)
40
+ params = (STDERR.print 'Enter parameters: '; gets.strip)
41
+ num_runs = (STDERR.print 'Enter # runs: '; gets).to_i
42
+
43
+ (1..num_runs).each do |run|
44
+ STDERR.print "run #{run}:",
45
+ `#{cmd} #{params} > #{outfile_name}-#{'%05d' % run}.csv`, "\n"
46
+ end
data/exe/blank2csv.rb ADDED
@@ -0,0 +1,40 @@
1
+ #!/usr/bin/env ruby -W0
2
+ # converts blank separated files to csv
3
+
4
+ require 'colorize'
5
+
6
+ String.disable_colorization false
7
+
8
+ require 'optparse'
9
+ require 'datafarming/error_handling'
10
+
11
+ help_msg = [
12
+ 'Convert whitespace delimited data to comma separated values.', '',
13
+ 'If filenames are specified, a backup is made for each file with',
14
+ "suffix '.orig' appended to the original filename and changes will",
15
+ 'be made in-place in the original file. If no filenames are given,',
16
+ 'the script reads from ' + 'stdin'.blue + ' and writes to ' +
17
+ 'stdout'.blue + '. In either case,',
18
+ 'all occurrences of one or more whitespace characters are replaced',
19
+ 'by commas.', '',
20
+ 'Syntax:',
21
+ "\n\t#{ErrorHandling.prog_name} [--help] [filenames...]".yellow, '',
22
+ "Arguments in square brackets are optional. A vertical bar '|'",
23
+ 'indicates valid alternatives for invoking the option. Prefix',
24
+ 'the command with "' + 'ruby'.yellow +
25
+ '" if it is not on your PATH.', '',
26
+ ' --help | -h | -? | ?'.green,
27
+ "\tProduce this help message.",
28
+ ' filenames...'.green,
29
+ "\tThe name[s] of the file[s] to be converted."
30
+ ]
31
+
32
+ OptionParser.new do |opts|
33
+ opts.banner = "Usage: #{$PROGRAM_NAME} [-h|--help] [filenames...]"
34
+ opts.on('-h', '-?', '--help') { ErrorHandling.clean_abort help_msg }
35
+ end.parse!
36
+
37
+ ErrorHandling.clean_abort help_msg if ARGV[0] == '?'
38
+
39
+ $-i = '.orig'
40
+ ARGF.each { |line| puts line.strip.gsub(/\s+/, ',') }
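The per-line transformation used by `blank2csv.rb` can be isolated into a small function for experimentation. This is only a sketch: `blank_to_csv` is a hypothetical name, and the in-place editing and `.orig` backup behavior that the script gets from `$-i` and `ARGF` is deliberately left out.

```ruby
# Hypothetical helper (not part of the datafarming gem).
# Trims leading/trailing whitespace, then replaces every run of
# whitespace (spaces or tabs) with a single comma.
def blank_to_csv(line)
  line.strip.gsub(/\s+/, ',')
end

puts blank_to_csv("  1.5   2\t3  \n") # prints 1.5,2,3
```

Because `strip` runs first, leading and trailing blanks never produce empty CSV fields. Note that fields which themselves contain spaces would be split, so this approach suits purely numeric or single-token data.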
data/exe/cat.rb ADDED
@@ -0,0 +1,36 @@
1
+ #!/usr/bin/env ruby -w
2
+ # concatenate one or more inputs to stdout
3
+
4
+ require 'colorize'
5
+
6
+ String.disable_colorization false
7
+
8
+ require 'optparse'
9
+ require 'datafarming/error_handling'
10
+
11
+ help_msg = [
12
+ 'Concatenate one or more input files, or ' + 'stdin'.blue +
13
+ ', to ' + 'stdout'.blue + '.', '',
14
+ 'Syntax:',
15
+ "\n\t#{ErrorHandling.prog_name} [--help] [filenames...]".yellow, '',
16
+ "Arguments in square brackets are optional. A vertical bar '|'",
17
+ 'indicates valid alternatives for invoking the option. Prefix',
18
+ 'the command with "' + 'ruby'.yellow +
19
+ '" if it is not on your PATH.', '',
20
+ ' --help | -h | -? | ?'.green,
21
+ "\tProduce this help message.",
22
+ ' filenames...'.green,
23
+ "\tThe name[s] of the file[s] to be concatenated.",
24
+ "\tRead from " + 'stdin'.blue + ' if no files are specified.',
25
+ "\tTo terminate interactive input enter " + 'ctrl-d'.cyan,
26
+ "\t(Mac/Unix/Linux) or " + 'ctrl-z'.cyan + ' (Windows).'
27
+ ]
28
+
29
+ OptionParser.new do |opts|
30
+ opts.banner = "Usage: #{$PROGRAM_NAME} [-h|--help] [filenames...]"
31
+ opts.on('-h', '-?', '--help') { ErrorHandling.clean_abort help_msg }
32
+ end.parse!
33
+
34
+ ErrorHandling.clean_abort help_msg if ARGV[0] == '?'
35
+
36
+ ARGF.each { |line| print line }
@@ -0,0 +1,38 @@
1
+ #!/usr/bin/env ruby -w
2
+ # Ruby script to convert end of line to current system default
3
+
4
+ require 'colorize'
5
+
6
+ String.disable_colorization false
7
+
8
+ require 'optparse'
9
+ require 'datafarming/error_handling'
10
+
11
+ help_msg = [
12
+ 'Convert DOS line endings to Unix/Mac OS X line endings or vice versa.', '',
13
+ 'If filenames are specified, a backup is made for each file with',
14
+ "suffix '.orig' appended to the original filename and changes will",
15
+ 'be made in-place in the original file. If no filenames are given,',
16
+ 'the script reads from ' + 'stdin'.blue + ' and writes to ' +
17
+ 'stdout'.blue + '.', '',
18
+ 'Syntax:',
19
+ "\n\t#{ErrorHandling.prog_name} [--help] [filenames...]".yellow, '',
20
+ "Arguments in square brackets are optional. A vertical bar '|'",
21
+ 'indicates valid alternatives for invoking the option. Prefix',
22
+ 'the command with "' + 'ruby'.yellow +
23
+ '" if it is not on your PATH.', '',
24
+ ' --help | -h | -? | ?'.green,
25
+ "\tProduce this help message.",
26
+ ' filenames...'.green,
27
+ "\tThe name[s] of the file[s] to be converted."
28
+ ]
29
+
30
+ OptionParser.new do |opts|
31
+ opts.banner = "Usage: #{$PROGRAM_NAME} [-h|--help] [filenames...]"
32
+ opts.on('-h', '-?', '--help') { ErrorHandling.clean_abort help_msg }
33
+ end.parse!
34
+
35
+ ErrorHandling.clean_abort help_msg if ARGV[0] == '?'
36
+
37
+ $-i = '.orig'
38
+ ARGF.each { |line| puts line.split(/\r\n|\r/) }
data/exe/cross.rb ADDED
@@ -0,0 +1,44 @@
1
+ #!/usr/bin/env ruby -w
2
+
3
+ require 'colorize'
4
+
5
+ String.disable_colorization false
6
+
7
+ require 'optparse'
8
+ require 'datafarming/error_handling'
9
+ require 'datafarming/cross'
10
+
11
+ help_msg = [
12
+ 'Create a crossed design from two or more input design files',
13
+ 'where each line is a design point. The crossed design is',
14
+ 'written to ' + 'stdout'.blue + ' in CSV format.', '',
15
+ 'Syntax:',
16
+ "\n\t#{ErrorHandling.prog_name} [--help] filenames...".yellow, '',
17
+ "Arguments in square brackets are optional. A vertical bar '|'",
18
+ 'indicates valid alternatives for invoking the option. Prefix',
19
+ 'the command with "' + 'ruby'.yellow +
20
+ '" if it is not on your PATH.', '',
21
+ ' --help | -h | -? | ?'.green,
22
+ "\tProduce this help message.",
23
+ ' filenames...'.green,
24
+ "\tThe names of the files containing designs to be crossed.",
25
+ "\tInput file data can be delimited by commas, semicolons,",
26
+ "\tcolons, or whitespace."
27
+ ]
28
+
29
+ OptionParser.new do |opts|
30
+ opts.banner = "Usage: #{$PROGRAM_NAME} [-h|--help] filenames..."
31
+ opts.on('-h', '-?', '--help') { ErrorHandling.clean_abort help_msg }
32
+ end.parse!
33
+
34
+ ErrorHandling.clean_abort help_msg if ARGV[0] == '?' || ARGV.length < 2
35
+
36
+ input_array = []
37
+ ARGV.each do |filename| # for each file given as a command-line arg...
38
+ # open the file, read all the lines, and then for each line use
39
+ # spaces, commas, colons, or semicolons to tokenize.
40
+ input_array << File.open(filename).readlines.map do |line|
41
+ line.strip.split(/[,:;]|\s+/)
42
+ end
43
+ end
44
+ CrossedDesigns.cross(input_array).each { |line| puts line.join(',') }
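The source of `CrossedDesigns.cross` is not shown in this diff, so the following is only a sketch of what crossing designs means, under the assumption that each design is an array of rows and each row is an array of tokens. `cross_designs` is a hypothetical name, and the gem's actual implementation may differ. Every row of the first design is paired with every row of each remaining design using `Array#product`, and each pairing is flattened into one output row.

```ruby
# Hypothetical stand-in for illustration; not the gem's CrossedDesigns.cross.
# designs: array of designs, each design an array of rows (arrays of tokens).
def cross_designs(designs)
  first, *rest = designs
  # product yields one [row_a, row_b, ...] combination per design point;
  # flatten(1) merges the rows of a combination into a single row.
  first.product(*rest).map { |combo| combo.flatten(1) }
end

design_a = [%w[1 2], %w[3 4]]  # 2 design points, 2 factors
design_b = [%w[x], %w[y], %w[z]] # 3 design points, 1 factor
cross_designs([design_a, design_b]).each { |row| puts row.join(',') }
```

Crossing a 2-point design with a 3-point design yields 2 × 3 = 6 rows, each with 2 + 1 = 3 columns. The number of rows grows multiplicatively with every design added.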
data/exe/csv2blank.rb ADDED
@@ -0,0 +1,39 @@
1
+ #!/usr/bin/env ruby -W0
2
+ # converts csv separated files to blank separated
3
+
4
+ require 'colorize'
5
+
6
+ String.disable_colorization false
7
+
8
+ require 'optparse'
9
+ require 'datafarming/error_handling'
10
+
11
+ help_msg = [
12
+ 'Convert comma separated values data to whitespace delimited.', '',
13
+ 'If filenames are specified, a backup is made for each file with',
14
+ "suffix '.orig' appended to the original filename and changes will",
15
+ 'be made in-place in the original file. If no filenames are given,',
16
+ 'the script reads from ' + 'stdin'.blue + ' and writes to ' +
17
+ 'stdout'.blue + '. In either case,',
18
+ 'all occurrences of commas are replaced by blanks.', '',
19
+ 'Syntax:',
20
+ "\n\t#{ErrorHandling.prog_name} [--help] [filenames...]".yellow, '',
21
+ "Arguments in square brackets are optional. A vertical bar '|'",
22
+ 'indicates valid alternatives for invoking the option. Prefix',
23
+ 'the command with "' + 'ruby'.yellow +
24
+ '" if it is not on your PATH.', '',
25
+ ' --help | -h | -? | ?'.green,
26
+ "\tProduce this help message.",
27
+ ' filenames...'.green,
28
+ "\tThe name[s] of the file[s] to be converted."
29
+ ]
30
+
31
+ OptionParser.new do |opts|
32
+ opts.banner = "Usage: #{$PROGRAM_NAME} [-h|--help] [filenames...]"
33
+ opts.on('-h', '-?', '--help') { ErrorHandling.clean_abort help_msg }
34
+ end.parse!
35
+
36
+ ErrorHandling.clean_abort help_msg if ARGV[0] == '?'
37
+
38
+ $-i = '.orig'
39
+ ARGF.each { |line| puts line.strip.tr(',', ' ') }
data/exe/mser.rb ADDED
@@ -0,0 +1,71 @@
1
+ #!/usr/bin/env ruby -w
2
+
3
+ require 'rubygems' if RUBY_VERSION =~ /^1\.8/
4
+ require 'colorize'
5
+
6
+ String.disable_colorization false
7
+
8
+ require 'optparse'
9
+ require 'datafarming/error_handling'
10
+
11
+ begin
12
+ require 'quickstats'
13
+ rescue LoadError
14
+ ErrorHandling.clean_abort [
15
+ "\n\tALERT: quickstats gem is not installed!".red,
16
+ "\tIf you have network connectivity, type:",
17
+ "\n\t\tgem install quickstats\n".yellow,
18
+ "\t(Admin privileges may be required.)\n\n"
19
+ ]
20
+ end
21
+
22
+ help_msg = [
23
+ 'Calculate MSER truncation statistics for one or more input files.', '',
24
+ 'Input files should consist of one column of data per file, with or',
25
+ 'without headers. The output consists of one line per input file,',
26
+ 'comprised of the MSER-based average of the data, the number of',
27
+ 'observations used to calculate that average, and the number of',
28
+ 'observations truncated, separated by commas. Output is written',
29
+ 'to ' + 'stdout'.blue + ' in CSV format, with headers.', '',
30
+ 'Syntax:',
31
+ "\n\t#{ErrorHandling.prog_name} [--help] filenames...".yellow, '',
32
+ "Arguments in square brackets are optional. A vertical bar '|'",
33
+ 'indicates valid alternatives for invoking the option. Prefix',
34
+ 'the command with "' + 'ruby'.yellow +
35
+ '" if it is not on your PATH.', '',
36
+ ' --help | -h | -? | ?'.green,
37
+ "\tProduce this help message.",
38
+ ' filenames...'.green,
39
+ "\tThe names of one or more files containing data to be analyzed."
40
+ ]
41
+
42
+ OptionParser.new do |opts|
43
+ opts.banner = "Usage: #{$PROGRAM_NAME} [-h|--help] filenames..."
44
+ opts.on('-h', '-?', '--help') { ErrorHandling.clean_abort help_msg }
45
+ end.parse!
46
+
47
+ ErrorHandling.clean_abort help_msg if ARGV.empty? || ARGV[0] == '?'
48
+
49
+ puts 'x-bar,n,trunc'
50
+
51
+ ARGV.each do |fname|
52
+ data = File.readlines(fname)
53
+ data.shift if data[0] =~ /[A-Za-z]/ # strip header if one present
54
+ data.map! { |line| line.chomp.strip.to_f }
55
+ m_stats = QuickStats.new
56
+ warmup = [(data.length * 0.5).to_i, data.length - 10].min
57
+ index = data.length - 1
58
+ while index > (data.length - warmup) && index > 1
59
+ m_stats.new_obs(data[index])
60
+ index -= 1
61
+ end
62
+ best = [m_stats.std_err, m_stats.avg, warmup]
63
+
64
+ while index > -1
65
+ m_stats.new_obs(data[index])
66
+ best = [m_stats.std_err, m_stats.avg, index] if m_stats.std_err <= best[0]
67
+ index -= 1
68
+ end
69
+
70
+ printf "%f,%d,%d\n", best[1], data.length - best[2], best[2]
71
+ end
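The MSER idea used by `mser.rb` can be sketched without the `quickstats` dependency. This is an illustration under stated assumptions, not the script's code: `mser_truncate` and its `max_fraction` parameter are hypothetical, running statistics are computed with Welford's update rather than `QuickStats`, and the input is assumed to hold at least two observations. The function scans the series from the end toward the start, tracks the standard error of the mean of the retained tail, and picks the truncation point with the smallest standard error, searching only the first `max_fraction` of the data.

```ruby
# Hypothetical sketch (not the gem's implementation).
# Returns [truncated_mean, observations_kept, observations_truncated].
# Assumes data holds at least two numeric observations.
def mser_truncate(data, max_fraction: 0.5)
  n = data.length
  max_trunc = (n * max_fraction).to_i # never discard more than this many
  count = 0
  mean = 0.0
  m2 = 0.0 # running sum of squared deviations from the mean
  best = nil
  (n - 1).downto(0) do |d|
    # Welford update: fold data[d] into the statistics of data[d..-1].
    count += 1
    delta = data[d] - mean
    mean += delta / count
    m2 += delta * (data[d] - mean)
    next if d > max_trunc || count < 2
    std_err = Math.sqrt(m2 / (count - 1) / count)
    # <= prefers the smallest truncation point among ties.
    best = [std_err, mean, d] if best.nil? || std_err <= best[0]
  end
  [best[1], n - best[2], best[2]]
end

series = [100.0, 50.0, 10.0] + [1.0] * 9 # transient, then steady state
avg, kept, truncated = mser_truncate(series)
printf "%f,%d,%d\n", avg, kept, truncated
```

For this series the three transient observations are dropped and the remaining nine average to 1.0. Reporting the retained count alongside the mean is what allows runs to be weighted properly when building a confidence interval.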