statsailr 0.7.1 → 0.7.2

Files changed (4)
  1. checksums.yaml +4 -4
  2. data/README.md +20 -19
  3. data/lib/statsailr/version.rb +1 -1
  4. metadata +2 -2
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: 51e84bd76b9ed92c47e11f3baac54ade7ef338b4ab8a1935e5aaac106f559c65
- data.tar.gz: 2f68acb064cdb8297052ee9246a49da73db1a70750d2290b83eebe2e21c15db2
+ metadata.gz: 428f469213d12c0613eb7eeab1690cad296ac94cff448bea96f140b55f2d8a24
+ data.tar.gz: 684382167f17c762e22c752ac3303ec6a5666cbf83f02ef672f515167e43d2b6
  SHA512:
- metadata.gz: c1c37162684765074a27dd2a2ec2373c939ab5b125df6ae5a7ba1bd86dfd66fe9ece1e5f39a525db403a296daae8cf594687b8ba4d405cff78202472324b1389
- data.tar.gz: 4f0908c784e59eefec18664ba79ea33cc24148ac3c5de2a005b64553a2c9fd5a48d60d9821140f3c3b2065ea757c4f8700ef1d8e2ff99171a175ed137c747cdc
+ metadata.gz: 64352807559d052da5415f75f0917f629146cf5b9671bdce7b294720795404b954f16569930d5ad6d68c40465a46efc2ab2a2d9e4d5d9f18b2673926d98045e5
+ data.tar.gz: 4bf58f63503187e3e8a33aae1485ef226123c03e8741593c2f6cf434e90d24b055d119b921edf8ccc69f43e5d93398ec46fcb6067eb2f1be4c9cd9f4c192361f
data/README.md CHANGED
@@ -1,16 +1,16 @@
  # StatSailr
 
- StatSailr provides a platform for users to focus on statistics. The backend statistics engine is [R](https://www.r-project.org/), so the results are reliable. The SataSailr script consists of three major blocks, TOPLEVEL, DATA, and PROC. Each block has its way of writing instructions, which works as an intuitive interface for R.
+ StatSailr provides a platform for users to focus on statistics. The backend statistics engine is [R](https://www.r-project.org/), so the results are reliable. The StatSailr script consists of three major blocks, TOPLEVEL, DATA, and PROC. Each block has its way of writing instructions, which works as an intuitive interface for R.
 
 
  ## Overview
 
- StatSailr is a Ruby program that enables users to manipulate data and to apply statistical procedures in an intuiitive way. StatSailr converts StatSailr script into R's internal representation, and executes it. The SataSailr script consists of three major blocks, TOPLEVEL, DATA, and PROC. TOPLEVEL loads and saves datasets. DATA blocks utilize DataSailr package as its backend, which enables wrting data manipulation insturctions in a rowwise way. PROC blocks have a series of PROC instructions, which are converted to R functions and are executed sequentially.
+ StatSailr is a Ruby program that enables users to manipulate data and to apply statistical procedures in an intuitive way. StatSailr converts StatSailr script into R's internal representation, and executes it. The SataSailr script consists of three major blocks, TOPLEVEL, DATA, and PROC. TOPLEVEL loads and saves datasets. DATA blocks utilize DataSailr package as its backend, which enables wrting data manipulation instructions in a rowwise way. PROC blocks have a series of PROC instructions, which are converted to R functions and are executed sequentially.
 
 
  ### Quick Introduction
 
- The following is an example of StatSailr script. It consists of TOPLEVEL instruction, DATA block and PROC blocks.
+ The following is an example of StatSailr script. It consists of TOPLEVEL, DATA and PROC blocks.
 
  ```
  READ builtin="mtcars"
@@ -30,6 +30,8 @@ END
  PROC REG data=new_mtcars
  lm hp ~ powerful
  END
+
+ SAVE new_mtcars file="./new_mtcars.rda" type="rdata"
  ```
 
  Save this script as, say, create_new_mtcars.slr and run.
@@ -124,7 +126,7 @@ $ gem install statsailr
  * Then, 'sailr' and 'sailrREPL' become available.
 
 
- ## Grammar of StatSalr
+ ## Grammar of StatSailr
 
  StatSailr script consists of three parts, TOPLEVEL, DATA block and PROC block.
 
@@ -140,7 +142,7 @@ TOPLEVEL statements import and save datasets, and also StatSailr's current worki
 
  Datasets can come from built-in datasets and files. In R, built-in datasets can be used by data() function, and StatSailr READ with 'builtin=' option does the same job.
 
- When importing datasets from files, currently there are three types of files availble, RDS, RDATA and CSV. RDS contains a single R object, and when you import it, you can neme the object using 'as=' option. If you omit 'as=' option, the name is created based on the filename. RDATA can contain multiple R objects, but their names cannot be changed when importing that are decided when saveing. CSV is a comma separated values file.
+ When importing datasets from files, currently there are three types of files available, RDS, RDATA and CSV. RDS contains a single R object, and when you import it, you can name the object using 'as=' option. If you omit 'as=' option, the name is created based on the filename. RDATA can contain multiple R objects, but their names cannot be changed when importing that are decided when saving. CSV is a comma separated values file.
 
  These dataset types are decided as follows. If you specify 'type' option, its type is used. If you do not specify it, it is inferred from the file extension.
 
@@ -162,7 +164,7 @@ SAVE new_mtcars file="./new_mtcars.csv" type="csv"
 
  * show and change working directory
 
- The concept of working directory is really important. If you run your SailrScript in an unintentional place and ouput some data, those data might overwrite your importan data.
+ The concept of working directory is really important. If you run your StatSailr script in an unintentional place and output some data, those data might overwrite your important data.
 
  The default working directory should be the directory where StatSailr script file exists. If you do not specify script file, such as when you run REPL, the default working directory should be the directory where you start your command (such as REPL).
 
@@ -176,7 +178,7 @@ SETWD "~/sailr_workspace"
 
  ### DATA block
 
- DATA block starts with the line of DATA, new dataset name and DATA options. For DATA options, 'set=' option is required which specify the input dataset. (Note that unlinke PROC options where 'data=' usually speifies input dataset, "set=" does the same job in DATA block. This difference comes from just an aesthetic reason.) Lines that follws the first DATA line represent how to manipulate input dataset. The lines are writtein in DataSailr script. END keyword specifies the end of DATA block.
+ DATA block starts with the line of DATA, new dataset name and DATA options. For DATA options, 'set=' option is required which specify the input dataset. (Note that unlink PROC options where 'data=' usually specifies input dataset, "set=" does the same job in DATA block. This difference comes from just an aesthetic reason.) Lines that follws the first DATA line represent how to manipulate input dataset. The lines are written in DataSailr script. END keyword specifies the end of DATA block.
 
  ```
  DATA new_dataset set=ori_dataset
@@ -194,24 +196,23 @@ The DataSailr script is described in detail at [its official website](https://da
 
  Briefly speaking,
 
- 1. Rowwise dataset manipulation
- + Varables correspond to column names.
+ 1. Row by row dataset processing
+ + Variable names correspond to column names.
  2. Simplified available types
- + Int, Double and String(=Characters) are basic types, that can be used in DataSailr script and also those values can be assigned to column value (of dataset).
- + Regular expression and boolean are not assigned to dataset. They can be held by variables, but do not modify dataset.
+ + Int, Double and String are basic types, that can be used in DataSailr script and also those values can be assigned to data sets.
+ + Regular expression and Boolean do not affect dataset.
  + Regular expression is used for if condition and extracting substrings.
  + Boolean is internal type that is used for if condition.
- 3. Assignment operator (=) creates new column with the column name same as the variable left-hand-side(LHS) of assignment operator.
+ 3. Assignment operator (=) creates new column or updates the existing column with the same column name as the variable left-hand-side of assignment operator.
  + If the variable already exits, the column is updated.
- + Exceptions are assigning regular expressions and boolen, which do not modify dataset. Variables pointing to those objects are only used in the script.
- 4. Control flow can be done using if-(else if)-(else) statement.
- + Condition part needs parentheses (), and statement part require curly braces.
+ + Exceptions are assigning Regular Expressions, which do not modify dataset. This is used to reuse them at different lines of code.
+ 4. If-else statement is the only control flow statement.
  5. Arithmetic operators
  6. Built-in functions
  + Mainly used to manipulate strings.
- 7. Regular expression
+ 7. Regular Expression
  8. UTF-8
- + Use UTF-8 for script and dataset. It is highly recommended that dataset should be saved using UTF-8 beforehand.
+ + It is highly recommended t use UTF-8 for script and dataset. Data set needs to be saved in UTF-8 beforehand.
  9. push!() and discard!() built-in functions
  + push!() can create multiple rows from current row.
  + discard!() can filter out specific rows by being used with if statements.
@@ -228,7 +229,7 @@ The PROCs gem holds basic PROC settings, such as PRINT and PLOT, and its main cl
 
  #### Format
 
- A typical PROC block looks like the follwing. The first line start with PROC, followed by PROC command name and PROC options. The PROC first line is followed by a list of instuctions with their main and optional arguments. The PROC block ends with END keyword.
+ A typical PROC block looks like the following. The first line start with PROC, followed by PROC command name and PROC options. The PROC first line is followed by a list of instructions with their main and optional arguments. The PROC block ends with END keyword.
 
  ```
  PROC COMMAND proc_opts
@@ -241,7 +242,7 @@ END
  * COMMAND
  + PROC command name
  * proc_opts
- + This parameter can be refered from any instructions in this block. In other words, this can be seen as global settings of this PROC block.
+ + This parameter can be referred from any instructions in this block. In other words, this can be seen as global settings of this PROC block.
  + Internally, this parameter is managed by RBridge::ParamManager.
  * proc_statement line
  + Each line consists of instruction, main argument and optional arguments. Main argument and optional arguments are separated by slash(/).
data/lib/statsailr/version.rb CHANGED
@@ -1,3 +1,3 @@
  module StatSailr
- VERSION = "0.7.1"
+ VERSION = "0.7.2"
  end
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: statsailr
  version: !ruby/object:Gem::Version
- version: 0.7.1
+ version: 0.7.2
  platform: ruby
  authors:
  - Toshihiro Umehara
  autorequire:
  bindir: exe
  cert_chain: []
- date: 2021-02-22 00:00:00.000000000 Z
+ date: 2021-02-24 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
  name: r_bridge