statsailr_procs_base 0.1.0

Files changed (34)
  1. checksums.yaml +7 -0
  2. data/.gitignore +10 -0
  3. data/Gemfile +10 -0
  4. data/LICENSE.txt +675 -0
  5. data/README.md +195 -0
  6. data/Rakefile +12 -0
  7. data/bin/console +15 -0
  8. data/bin/setup +8 -0
  9. data/lib/statsailr_procs_base.rb +16 -0
  10. data/lib/statsailr_procs_base/check_statsailr_version.rb +18 -0
  11. data/lib/statsailr_procs_base/path.rb +7 -0
  12. data/lib/statsailr_procs_base/proc_setting/common_utility.R +16 -0
  13. data/lib/statsailr_procs_base/proc_setting/proc_cat.R +65 -0
  14. data/lib/statsailr_procs_base/proc_setting/proc_cat.rb +44 -0
  15. data/lib/statsailr_procs_base/proc_setting/proc_common/dev_copy.R +6 -0
  16. data/lib/statsailr_procs_base/proc_setting/proc_common/dev_copy.rb +15 -0
  17. data/lib/statsailr_procs_base/proc_setting/proc_common/factor.R +7 -0
  18. data/lib/statsailr_procs_base/proc_setting/proc_common/factor.rb +15 -0
  19. data/lib/statsailr_procs_base/proc_setting/proc_common/numeric.R +7 -0
  20. data/lib/statsailr_procs_base/proc_setting/proc_common/numeric.rb +15 -0
  21. data/lib/statsailr_procs_base/proc_setting/proc_mult.R +12 -0
  22. data/lib/statsailr_procs_base/proc_setting/proc_mult.rb +35 -0
  23. data/lib/statsailr_procs_base/proc_setting/proc_plot.R +52 -0
  24. data/lib/statsailr_procs_base/proc_setting/proc_plot.rb +54 -0
  25. data/lib/statsailr_procs_base/proc_setting/proc_print.R +60 -0
  26. data/lib/statsailr_procs_base/proc_setting/proc_print.rb +44 -0
  27. data/lib/statsailr_procs_base/proc_setting/proc_reg.rb +17 -0
  28. data/lib/statsailr_procs_base/proc_setting/proc_two.R +33 -0
  29. data/lib/statsailr_procs_base/proc_setting/proc_two.rb +38 -0
  30. data/lib/statsailr_procs_base/proc_setting/proc_uni.R +105 -0
  31. data/lib/statsailr_procs_base/proc_setting/proc_uni.rb +27 -0
  32. data/lib/statsailr_procs_base/version.rb +7 -0
  33. data/statsailr_procs_base.gemspec +37 -0
  34. metadata +80 -0
data/README.md ADDED
@@ -0,0 +1,195 @@
1
+ # About statsailr_procs_base gem
2
+
3
+ This 'statsailr_procs_base' gem provides a collection of fundamental PROC settings and makes PROCs available to StatSailr programs. The 'statsailr' gem itself does not contain PROC settings, so this gem is essential for StatSailr to work as a useful statistics system.
4
+
5
+ These basic PROC settings are separated from the main 'statsailr' gem into the 'statsailr_procs_base' gem for maintainability. The StatSailr system (i.e. main system + PROCs) is still under development and is updated frequently, and the PROCs and the main system do not have fixed release cycles yet. For those reasons, it is better to update them as separate gems.
6
+
7
+
8
+ ## Installation
9
+
10
+ This gem is usually installed together with the 'statsailr' gem, following the dependency setting in statsailr.gemspec. If you want to try the most up-to-date version, install it manually from GitHub.
11
+
12
+
13
+ ## How each StatSailr's PROC functionality is defined
14
+
15
+ How each PROC block behaves and which instructions it has are defined by PROC settings. This gem provides those PROC settings. StatSailr calls the path_to_proc_setting method of the gem that provides PROCs (StatSailr::ProcsBase::path_to_proc_setting in this gem) and uses the returned path to access the gem's PROC settings. The method returns the base directory path that contains the PROC settings (i.e. the lib/statsailr_procs_base/proc_setting directory). Based on the PROCs called from StatSailr, the corresponding setting files are identified and their settings are loaded. Based on the settings, appropriate R functions are generated and executed.
16
+
17
+ Following these settings, StatSailr converts a PROC block to a series of R functions. PROC settings are written as Ruby modules. Internally, StatSailr creates a Ruby object for each block, and those objects extend their functionality by mixing in the corresponding PROC setting module. Currently, the file and module names are determined from the PROC's command name. For example, if the block starts with 'PROC CAT', the 'ProcCat' module written in 'proc_cat.rb' is used for the functionality extension.
18
+
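The mix-in step described above can be pictured with plain Ruby. This is a minimal sketch, not StatSailr's actual implementation: the `Setting` struct, the `block` object, and the dispatch through `public_send` are assumptions made for illustration. Only the `ProcPrint` module name and the `setting_for_head` method name come from this README.

```ruby
# Hypothetical stand-in for the setting object passed to setting_for_* methods.
Setting = Struct.new(:libname, :func_name, :print_opt)

# PROC setting module, reduced from the README's proc_print.rb example.
module ProcPrint
  def setting_for_head(setting)
    setting.libname = "utils"
    setting.func_name = "head"
    setting.print_opt = true
  end
end

# Hypothetical per-block object; extending it mixes in the module's methods.
block = Object.new
block.extend(ProcPrint)

# Build the method name from the instruction name and call it.
instruction = "head"
setting = Setting.new
block.public_send("setting_for_#{instruction}", setting)

puts "#{setting.libname}::#{setting.func_name}"
```

Using `extend` on a single object, rather than `include` in a class, keeps the settings of one PROC block from leaking into other blocks.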
19
+
20
+ ## Understanding PROC settings
21
+
22
+ The following is a typical PROC setting, followed by an explanation of each part.
23
+
24
+ ```
25
+ # proc_print.rb
26
+
27
+ module ProcPrint
28
+ def setting_for_head( setting )
29
+ setting.libname = "utils"
30
+ setting.func_name = "head"
31
+ setting.main_arg_and_how_to_treat = [ "n", "read_as_intvec", "allow_nil"]
32
+ setting.runtime_args = { "x" => param("data")}
33
+ setting.store_result = false
34
+ setting.print_opt = true
35
+ end
36
+ end
37
+ ```
38
+
39
+ * module ProcPrint
40
+ + This module holds the settings for the 'PROC PRINT' block.
41
+ * def setting_for_head
42
+ + This method holds the setting for the 'head' instruction.
43
+ * setting.libname
44
+ + When libname is specified, the function is looked up from the package.
45
+ + In the above example, it is equivalent to utils::head()
46
+ + When libname is set to nil, the function is looked up from an environment (see envname) or from the global environment.
47
+ * setting.envname
48
+ + When libname is set nil, and envname is specified, the function is looked up from this environment.
49
+ + This enables R functions to be called with environment name, and each PROC can implement its own R functions within their environment, and prevents function name conflicts.
50
+ * setting.func_name
51
+ + R's function name to be called.
52
+ + package name or environment name can be specified as mentioned above.
53
+ * setting.main_arg_and_how_to_treat
54
+ + This is specified as an Array of length 3.
55
+ + The first element is R's function argument name to be passed to.
56
+ + The second element of its value is how to treat the main argument.
57
+ + The third element of its value is whether to allow not specifying main argument. "allow_nil" or "no_nil"
58
+ + In the above example, the value specified for the main argument is interpreted as intvec, and is passed to the R function's 'n' argument.
59
+ + "allow_nil" allows users to omit main argument, and default value of 6 is used. (See head documentation https://stat.ethz.ch/R-manual/R-devel/library/utils/html/head.html )
60
+ + If the instruction does not take a main argument, nil or [nil, nil, nil] is specified.
61
+ * setting.runtime_args
62
+ + This is specified in Hash or nil.
63
+ + This setting is used to pass objects that are generated at runtime to R function.
64
+ + The objects include objects specified in PROC options and results of previous instructions within the same block.
65
+ + To define, param(), result() and one_from() methods can be used. Details are described later.
66
+ * setting.store_result
67
+ + true/false
68
+ + If this setting is true, this instruction's result can be accessed via setting.runtime_args.
69
+ + If set false, this instruction's result cannot be accessed.
70
+ * setting.print_opt
71
+ + true/false
72
+ + If this setting is true, the result is printed out using "print" function.
73
+ + If set false, the result is not printed out.
74
+ + String
75
+ + If String value is set, the result is printed out using the name specified.
76
+ * setting.plot_opt
77
+ + true/false
78
+ + This setting takes effect only when the graphics device does not show on a display and output goes to a file.
79
+ + If true, StatSailr conducts dev.copy() at the end of this current instruction, and saves graphics device content to file.
80
+ + If the current graphics device outputs to display, this setting is useless and ignored.
81
+
82
+
83
+ ## More about main_arg_and_how_to_treat
84
+
85
+ The following methods can be used to parse and convert a PROC instruction's main argument to an R object. The method name needs to be specified as a String.
86
+
87
+ * read_as_formula
88
+ * read_as_strvec
89
+ * read_as_one_str
90
+ * read_as_numvec
91
+ * read_as_intvec
92
+ * read_as_realvec
93
+ * read_as_symbol
94
+ * read_symbols_as_strvec
95
+
96
+
97
+
98
+ ## More about setting.runtime_args
99
+
100
+ This setting is used to access objects that are generated at runtime, and to pass those objects to the R function. These objects include objects that are specified in PROC options, and results that are returned from previous instructions within the same block.
101
+
102
+ * param() method
103
+ + param() can access object specified in PROC option.
104
+ + (The reason why 'param' is used for this access is that PROC option parameter is managed by RBridge::ParamManager.)
105
+ + For example, "x" => param("data") means that the object named "data" in the PROC options is passed to the "x" argument of the R function.
106
+ * result() method
107
+ + result() can access a result object of an instruction that is specified by instruction name.
108
+ + If result() method takes multiple argument or multiple instruction names, it returns ResultNameArray object.
109
+ + The last result from specified instruction results is used.
110
+ * Pointer to R object
111
+ + R object that is statically generated can also be used. Usually this can be used as a default value.
112
+ * one_from() method
113
+ + one_from() method can take ResultName, ResultNameArray, ParamName or a pointer to an R object.
114
+ + one_from() can take multiple arguments, and their order defines priority about which object to use.
115
+ + (e.g.) 'setting.runtime_args = {"data" => one_from( result("factor", "numeric"), param("data")) }' means the following
116
+ + If there are already instructions of "factor" or "numeric", use the last result of them.
117
+ + If not, the object that is specified by "data" in PROC options is used.
118
+
119
+
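The priority rule of one_from() can be modelled in a few lines of plain Ruby. This is only a sketch of the described behavior under stated assumptions: `first_available`, `last_result`, and `option` are hypothetical helpers, and the array of executed instructions and the options hash stand in for whatever StatSailr manages internally. They are not the gem's ResultName or ParamName objects.

```ruby
# Hypothetical model of one_from(): return the first candidate that resolves.
# Each candidate is a lambda that returns an object, or nil when unavailable.
def first_available(*candidates)
  candidates.each do |candidate|
    value = candidate.call
    return value unless value.nil?
  end
  nil
end

# Hypothetical model of result("factor", "numeric"): the most recent stored
# result among the named instructions. `executed` is in execution order.
def last_result(executed, *names)
  lambda do
    pair = executed.reverse.find { |name, _value| names.include?(name) }
    pair && pair[1]
  end
end

# Hypothetical model of param("data"): look the name up in the PROC options.
def option(params, name)
  -> { params[name] }
end

params = { "data" => "data from PROC option" }

# A "numeric" instruction ran earlier, so its result has priority.
with_numeric = [["numeric", "data after numeric"]]
puts first_available(last_result(with_numeric, "factor", "numeric"), option(params, "data"))

# No earlier "factor" or "numeric" instruction: fall back to the PROC option.
puts first_available(last_result([], "factor", "numeric"), option(params, "data"))
```

The argument order carries the priority, which is why result(...) is listed before param(...) in the README example.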
120
+ ## Enable custom R function
121
+
122
+ StatSailr PROC instructions can also call custom R functions that are not defined in R's libraries. The source_r_file() method, provided by the ProcSettingModule module, does this work. source_r_file() makes the R functions defined in the given file available to R and to StatSailr PROC instructions.
123
+
124
+ The following example enables functions in proc_scatter.R. (The setting file name is proc_scatter.rb, and File.basename(__FILE__ , ".rb") + ".R" replaces the extension with ".R".) Note that source_r_file()'s first argument takes the absolute path of the directory containing the R file. This restriction prevents unintended files from being loaded.
125
+
126
+ ```
127
+ # proc_scatter.rb
128
+ module ProcScatter
129
+ include ProcSettingModule
130
+ ...
131
+ source_r_file( __dir__, File.basename(__FILE__ , ".rb") + ".R")
132
+ ...
133
+ ...
134
+ end
135
+ ```
136
+
137
+ ## Validate PROC options
138
+
139
+ Each PROC option (i.e. an option that follows the PROC command) can be validated. The option name, the option value type and whether the option is required can be checked, and types are converted appropriately if necessary.
140
+
141
+ For example, in many PROCs, data option is required. To guarantee the existence of data option, pass "data" with "required: true" to validate_option().
142
+
143
+ * validate_option() method takes the following arguments
144
+ + option_name : option to validate
145
+ + is_a : value assigned to the option should belong to type(s) specified.
146
+ + "Float", "Integer", "String" or "SymbolR" can be specified.
147
+ + Array can be used to allow some types.
148
+ + as : value is finally dealt as this type specified.
149
+ + required : PROC command requires the option or not
150
+
151
+ The following example specifies PROC PRINT requires 'data' option, SymbolR or String is accepted for its value, and the value is finally dealt as SymbolR.
152
+
153
+ ```
154
+ module ProcPrint
155
+ include ProcSettingModule
156
+ ...
157
+ validate_option("data", is_a: ["SymbolR", "String"], as: "SymbolR", required: true)
158
+ ```
159
+
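The three checks described above (presence, accepted types, final type) can be sketched as a standalone function. This is not the gem's validate_option(): `check_option` and the `SymbolR` struct below are hypothetical stand-ins, and only the String to SymbolR conversion is modelled.

```ruby
# Hypothetical stand-in for StatSailr's SymbolR type.
SymbolR = Struct.new(:name)

# Hypothetical validator modelled on the arguments of validate_option().
def check_option(options, name, is_a:, as: nil, required: false)
  unless options.key?(name)
    raise ArgumentError, "option '#{name}' is required" if required
    return nil
  end

  value = options[name]
  # Compare by the unqualified class name, e.g. "String" or "SymbolR".
  type_name = value.class.name.split("::").last
  unless Array(is_a).include?(type_name)
    raise TypeError, "option '#{name}' must be one of #{Array(is_a).join(', ')}"
  end

  # Only the String -> SymbolR conversion is modelled in this sketch.
  value = SymbolR.new(value) if as == "SymbolR" && value.is_a?(String)
  value
end

data = check_option({ "data" => "iris" }, "data",
                    is_a: ["SymbolR", "String"], as: "SymbolR", required: true)
puts data.name
```

Accepting both types and normalizing to one keeps the later instruction settings simple, since they only ever see a single type.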
160
+
161
+ ## Share common settings with multiple settings
162
+
163
+ StatSailr PROC instruction settings can be shared by multiple setting modules. A shared setting can be included using the add_setting_from() method defined in ProcSettingModule. The included file is also written as a module, and its format is almost the same.
164
+
165
+ In the following example, ProcPlot includes DevCopySetting, which provides the functionality of the dev.copy() function.
166
+
167
+
168
+ ```
169
+ # proc_plot.rb
170
+ module ProcPlot
171
+ include ProcSettingModule
172
+ add_setting_from( __dir__, "proc_common/dev_copy.rb" ) # This file should have DevCopySetting module.
173
+ ...
174
+ ...
175
+ end
176
+
177
+ # proc_common/dev_copy.rb
178
+ module DevCopySetting
179
+ def setting_for_dev_copy( setting )
180
+ ...
181
+ ...
182
+ end
183
+ end
184
+ ```
185
+
186
+ ## Contributing
187
+
188
+ Bug reports are welcome on GitHub at https://github.com/niceume/statsailr_procs_base.
189
+
190
+
191
+ ## License
192
+
193
+ The gem is available as open source under the terms of the [GPL v3 License](https://www.gnu.org/licenses/gpl-3.0.en.html).
194
+
195
+
data/Rakefile ADDED
@@ -0,0 +1,12 @@
1
+ # frozen_string_literal: true
2
+
3
+ require "bundler/gem_tasks"
4
+ require "rake/testtask"
5
+
6
+ Rake::TestTask.new(:test) do |t|
7
+ t.libs << "test"
8
+ t.libs << "lib"
9
+ t.test_files = FileList["test/**/*_test.rb"]
10
+ end
11
+
12
+ task default: :test
data/bin/console ADDED
@@ -0,0 +1,15 @@
1
+ #!/usr/bin/env ruby
2
+ # frozen_string_literal: true
3
+
4
+ require "bundler/setup"
5
+ require "statsailr_procs_base"
6
+
7
+ # You can add fixtures and/or initialization code here to make experimenting
8
+ # with your gem easier. You can also use a different console, if you like.
9
+
10
+ # (If you use this, don't forget to add pry to your Gemfile!)
11
+ # require "pry"
12
+ # Pry.start
13
+
14
+ require "irb"
15
+ IRB.start(__FILE__)
data/bin/setup ADDED
@@ -0,0 +1,8 @@
1
+ #!/usr/bin/env bash
2
+ set -euo pipefail
3
+ IFS=$'\n\t'
4
+ set -vx
5
+
6
+ bundle install
7
+
8
+ # Do any other automated setup that you need to do here
data/lib/statsailr_procs_base.rb ADDED
@@ -0,0 +1,16 @@
1
+ # frozen_string_literal: true
2
+
3
+ require_relative "statsailr_procs_base/version"
4
+ require_relative "statsailr_procs_base/path"
5
+ require_relative "statsailr_procs_base/check_statsailr_version"
6
+
7
+ module StatSailr
8
+ module ProcsBase
9
+ class Error < StandardError; end
10
+
11
+ # Your code goes here...
12
+ end
13
+ end
14
+
15
+ StatSailr::ProcsBase::check_statsailr_version()
16
+
data/lib/statsailr_procs_base/check_statsailr_version.rb ADDED
@@ -0,0 +1,18 @@
1
+ module StatSailr
2
+ module ProcsBase
3
+ def self.check_statsailr_version
4
+ statsailr_version = StatSailr::VERSION
5
+ statsailr_version_at_least = "0.7.1"
6
+ return version_should_be_at_least( statsailr_version , statsailr_version_at_least )
7
+ end
8
+
9
+ def self.version_should_be_at_least( gem_version, str_version )
10
+ if( Gem::Version.new(gem_version) >= Gem::Version.new(str_version) )
11
+ return true
12
+ else
13
+ puts "\e[1m" + "WARNING: statsailr gem (#{gem_version}) needs to be newer or equal to ver. #{str_version}" + "\e[22m"
14
+ return false
15
+ end
16
+ end
17
+ end
18
+ end
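The check above compares versions through Gem::Version rather than as raw strings. The short sketch below shows why that matters; `at_least?` is a hypothetical helper name that mirrors the comparison in version_should_be_at_least, and the version strings are made-up inputs.

```ruby
require "rubygems" # normally loaded already; required here to be explicit

# Hypothetical helper mirroring the comparison in version_should_be_at_least.
def at_least?(current, minimum)
  Gem::Version.new(current) >= Gem::Version.new(minimum)
end

# Segment-wise comparison: 10 > 7, so 0.10.0 satisfies the 0.7.1 minimum.
puts at_least?("0.10.0", "0.7.1")

# Plain string comparison is character-wise: "1" < "7", so this is false.
puts "0.10.0" >= "0.7.1"
```

With plain strings, a release such as 0.10.0 would be wrongly reported as older than 0.7.1.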
data/lib/statsailr_procs_base/path.rb ADDED
@@ -0,0 +1,7 @@
1
+ module StatSailr
2
+ module ProcsBase
3
+ def self.path_to_proc_setting
4
+ return __dir__ + "/" + "proc_setting"
5
+ end
6
+ end
7
+ end
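How a caller might combine this base directory with the README's naming convention ('PROC CAT' maps to the 'ProcCat' module in 'proc_cat.rb') can be sketched as follows. `locate_proc_setting` is a hypothetical helper and the base directory is a placeholder; StatSailr's real lookup code is not shown in this diff.

```ruby
# Hypothetical helper: derive the setting file path and module name from a
# PROC command name, following the convention described in the README.
def locate_proc_setting(base_dir, proc_command)
  base = "proc_" + proc_command.downcase
  {
    path: File.join(base_dir, base + ".rb"),
    module_name: base.split("_").map(&:capitalize).join
  }
end

# Placeholder directory; in practice this would be the value returned by
# StatSailr::ProcsBase.path_to_proc_setting.
located = locate_proc_setting("/path/to/proc_setting", "CAT")
puts located[:path]
puts located[:module_name]
```

File.join is used here instead of string concatenation with "/" so that duplicate separators are avoided if the base directory already ends with one.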
data/lib/statsailr_procs_base/proc_setting/common_utility.R ADDED
@@ -0,0 +1,16 @@
1
+ if_exist_else = function( name, lst, default_val ){
2
+ if( name %in% names(lst)){
3
+ return( lst[[name]] )
4
+ }else{
5
+ return( default_val )
6
+ }
7
+ }
8
+
9
+ get_pkg_fun = function(x) {
10
+ if(length(grep("::", x))>0) {
11
+ parts<-strsplit(x, "::")[[1]]
12
+ getExportedValue(parts[1], parts[2])
13
+ } else {
14
+ x
15
+ }
16
+ }
data/lib/statsailr_procs_base/proc_setting/proc_cat.R ADDED
@@ -0,0 +1,65 @@
1
+ sts_cat = new.env()
2
+
3
+ sts_cat$table = function( data , vars , missing = FALSE, ... ){
4
+ if( (! is.character(vars)) || length(vars) != 2 ){
5
+ stop("vars argument requires character vector with size of 2")
6
+ }
7
+
8
+ df_for_table = data[ , vars ]
9
+ if( missing ){
10
+ freq_ori = table(df_for_table, useNA = "ifany")
11
+ }else{
12
+ freq_ori = table(df_for_table)
13
+ }
14
+
15
+ freq = addmargins( freq_ori )
16
+ allper = format( addmargins( prop.table( freq_ori) * 100), digits = 4)
17
+ rowper = format( addmargins( prop.table( freq_ori, 1) * 100), digits = 4)
18
+ colper = format( addmargins( prop.table( freq_ori, 2) * 100), digits = 4)
19
+
20
+ nrow = nrow(freq)
21
+ ncol = ncol(freq)
22
+ nrow_ori = nrow(freq_ori)
23
+
24
+ temp = as.array( numeric( nrow * ncol * 4 ))
25
+ dim( temp ) = c( nrow * 4, ncol )
26
+ result = as.table( temp )
27
+
28
+ colnames(result) = c( colnames( freq_ori ) , "Total")
29
+
30
+ row_label1 = c( rep("%",nrow_ori ) , "%")
31
+ row_label2 = c( rep("Row%",nrow_ori ) , "")
32
+ row_label3 = c( rep("Col%",nrow_ori ) , "")
33
+
34
+ rownames(result) = as.vector( mapply( c, c( rownames( freq_ori ), "Total" ), row_label1 , row_label2 , row_label3 ))
35
+
36
+ cat( paste( "\t", names(freq), "\n", sep = "\t", collapse = "\t" ))
37
+ for( irow in seq(1, nrow) ){
38
+ for( icol in seq(1, ncol) ){
39
+ result[ (irow-1)*4 + 1, icol] = freq[irow, icol]
40
+ }
41
+ for( icol in seq(1, ncol) ){
42
+ result[ (irow-1)*4 + 2, icol] = allper[irow, icol]
43
+ }
44
+ for( icol in seq(1, ncol) ){
45
+ if(irow != nrow && icol != ncol){
46
+ result[ (irow-1)*4 + 3, icol] = rowper[irow, icol]
47
+ }else{
48
+ result[ (irow-1)*4 + 3, icol] = NA
49
+ }
50
+ }
51
+ for( icol in seq(1, ncol) ){
52
+ if(icol != ncol && irow!= nrow){
53
+ result[ (irow-1)*4 + 4, icol] = colper[irow, icol]
54
+ }else{
55
+ result[ (irow-1)*4 + 4, icol] = NA
56
+ }
57
+ }
58
+ }
59
+
60
+ cat( paste( vars[1] , " vs ", vars[2], "\n" ) )
61
+ print( result )
62
+
63
+ return( freq_ori )
64
+ }
65
+
data/lib/statsailr_procs_base/proc_setting/proc_cat.rb ADDED
@@ -0,0 +1,44 @@
1
+ module ProcCat
2
+ include ProcSettingModule
3
+
4
+ source_r_file( __dir__, File.basename(__FILE__ , ".rb") + ".R")
5
+ validate_option("data", is_a: ["SymbolR", "String"], as: "SymbolR" , required: true)
6
+
7
+ def setting_for_table( setting )
8
+ setting.libname = nil
9
+ setting.envname = "sts_cat"
10
+ setting.func_name = "table"
11
+ setting.main_arg_and_how_to_treat = ["vars", :read_as_strvec, :no_nil]
12
+ setting.runtime_args = {"data" => param("data") }
13
+ setting.store_result = true
14
+ setting.print_opt = false
15
+ end
16
+
17
+ def setting_for_xtabs( setting )
18
+ setting.libname = "stats"
19
+ setting.func_name = "xtabs"
20
+ setting.main_arg_and_how_to_treat = ["formula", :read_as_formula, :no_nil]
21
+ setting.runtime_args = {"data" => param("data") }
22
+ setting.store_result = true
23
+ setting.print_opt = true
24
+ end
25
+
26
+ def setting_for_fisher_test( setting )
27
+ setting.libname = "stats"
28
+ setting.func_name = "fisher.test"
29
+ setting.main_arg_and_how_to_treat = nil
30
+ setting.runtime_args = {"x" => one_from( result("table"), result("xtabs") ) }
31
+ setting.store_result = true
32
+ setting.print_opt = true
33
+ end
34
+
35
+ def setting_for_chisq_test( setting )
36
+ setting.libname = "stats"
37
+ setting.func_name = "chisq.test"
38
+ setting.main_arg_and_how_to_treat = nil
39
+ setting.runtime_args = {"x" => one_from( result("table"), result("xtabs") ) }
40
+ setting.store_result = true
41
+ setting.print_opt = true
42
+ end
43
+ end
44
+
data/lib/statsailr_procs_base/proc_setting/proc_common/dev_copy.R ADDED
@@ -0,0 +1,6 @@
1
+ sts_dev_copy = new.env()
2
+
3
+ sts_dev_copy$dev_copy = function( device, ... ){
4
+ dev.copy( device , ...)
5
+ dev.off()
6
+ }
data/lib/statsailr_procs_base/proc_setting/proc_common/dev_copy.rb ADDED
@@ -0,0 +1,15 @@
1
+ module DevCopySetting
2
+ include ProcSettingModule
3
+ source_r_file(__dir__, File.basename(__FILE__ , ".rb") + ".R")
4
+
5
+ def setting_for_dev_copy( setting )
6
+ setting.libname = nil
7
+ setting.envname = "sts_dev_copy"
8
+ setting.func_name = "dev_copy"
9
+ setting.main_arg_and_how_to_treat = [ "device" , :read_as_symbol , :allow_nil ]
10
+ setting.runtime_args = {}
11
+ setting.store_result = false
12
+ setting.print_opt = false
13
+ end
14
+ end
15
+
data/lib/statsailr_procs_base/proc_setting/proc_common/factor.R ADDED
@@ -0,0 +1,7 @@
1
+ sts_factor = new.env()
2
+
3
+ sts_factor$convert_to_factor = function( data, vars ){
4
+ data[vars] = lapply(data[vars], as.factor)
5
+ print( paste( "as.factor is applied to", paste( vars, collapse=",") , sep=" ") )
6
+ return( data )
7
+ }