RubyGems - galaaz - Versions diffs - 0.4.9 → 0.4.10 - Mend

galaaz 0.4.9 → 0.4.10

Files changed (76)

checksums.yaml +4 -4
data/README.md +798 -285
data/blogs/galaaz_ggplot/galaaz_ggplot.Rmd +3 -12
data/blogs/galaaz_ggplot/galaaz_ggplot.aux +5 -7
data/blogs/galaaz_ggplot/galaaz_ggplot.html +69 -29
data/blogs/galaaz_ggplot/galaaz_ggplot.pdf +0 -0
data/blogs/galaaz_ggplot/galaaz_ggplot_files/figure-html/midwest_rb.png +0 -0
data/blogs/galaaz_ggplot/galaaz_ggplot_files/figure-html/scatter_plot_rb.png +0 -0
data/blogs/galaaz_ggplot/galaaz_ggplot_files/figure-latex/midwest_rb.pdf +0 -0
data/blogs/galaaz_ggplot/galaaz_ggplot_files/figure-latex/scatter_plot_rb.pdf +0 -0
data/blogs/galaaz_ggplot/midwest.Rmd +1 -9
data/blogs/gknit/gknit.Rmd +37 -40
data/blogs/gknit/gknit.html +32 -30
data/blogs/gknit/gknit.md +36 -37
data/blogs/gknit/gknit.pdf +0 -0
data/blogs/gknit/gknit.tex +35 -37
data/blogs/manual/manual.Rmd +548 -125
data/blogs/manual/manual.html +509 -286
data/blogs/manual/manual.md +798 -285
data/blogs/manual/manual.pdf +0 -0
data/blogs/manual/manual.tex +2816 -0
data/blogs/manual/manual_files/figure-latex/diverging_bar.pdf +0 -0
data/blogs/nse_dplyr/nse_dplyr.Rmd +240 -74
data/blogs/nse_dplyr/nse_dplyr.html +191 -87
data/blogs/nse_dplyr/nse_dplyr.md +361 -107
data/blogs/nse_dplyr/nse_dplyr.pdf +0 -0
data/blogs/nse_dplyr/nse_dplyr.tex +1373 -0
data/blogs/ruby_plot/ruby_plot.Rmd +61 -81
data/blogs/ruby_plot/ruby_plot.html +54 -57
data/blogs/ruby_plot/ruby_plot.md +48 -67
data/blogs/ruby_plot/ruby_plot.pdf +0 -0
data/blogs/ruby_plot/ruby_plot_files/figure-html/dose_len.png +0 -0
data/blogs/ruby_plot/ruby_plot_files/figure-html/facet_by_delivery.png +0 -0
data/blogs/ruby_plot/ruby_plot_files/figure-html/facet_by_dose.png +0 -0
data/blogs/ruby_plot/ruby_plot_files/figure-html/facets_by_delivery_color.png +0 -0
data/blogs/ruby_plot/ruby_plot_files/figure-html/facets_by_delivery_color2.png +0 -0
data/blogs/ruby_plot/ruby_plot_files/figure-html/facets_with_jitter.png +0 -0
data/blogs/ruby_plot/ruby_plot_files/figure-html/facets_with_points.png +0 -0
data/blogs/ruby_plot/ruby_plot_files/figure-html/final_box_plot.png +0 -0
data/blogs/ruby_plot/ruby_plot_files/figure-html/final_violin_plot.png +0 -0
data/blogs/ruby_plot/ruby_plot_files/figure-html/violin_with_jitter.png +0 -0
data/blogs/ruby_plot/ruby_plot_files/figure-latex/dose_len.png +0 -0
data/blogs/ruby_plot/ruby_plot_files/figure-latex/facet_by_delivery.png +0 -0
data/blogs/ruby_plot/ruby_plot_files/figure-latex/facet_by_dose.png +0 -0
data/blogs/ruby_plot/ruby_plot_files/figure-latex/facets_by_delivery_color.png +0 -0
data/blogs/ruby_plot/ruby_plot_files/figure-latex/facets_by_delivery_color2.png +0 -0
data/blogs/ruby_plot/ruby_plot_files/figure-latex/facets_with_decorations.png +0 -0
data/blogs/ruby_plot/ruby_plot_files/figure-latex/facets_with_jitter.png +0 -0
data/blogs/ruby_plot/ruby_plot_files/figure-latex/facets_with_points.png +0 -0
data/blogs/ruby_plot/ruby_plot_files/figure-latex/final_box_plot.png +0 -0
data/blogs/ruby_plot/ruby_plot_files/figure-latex/final_violin_plot.png +0 -0
data/blogs/ruby_plot/ruby_plot_files/figure-latex/violin_with_jitter.png +0 -0
data/lib/R_interface/rdata_frame.rb +0 -12
data/lib/R_interface/robject.rb +14 -14
data/lib/R_interface/ruby_extensions.rb +3 -31
data/lib/R_interface/rvector.rb +0 -12
data/lib/gknit/knitr_engine.rb +5 -3
data/lib/util/exec_ruby.rb +22 -61
data/specs/tmp.rb +26 -12
data/version.rb +1 -1
metadata +22 -17
data/bin/gknit_old_r +0 -236
data/blogs/dev/dev.Rmd +0 -23
data/blogs/dev/dev.md +0 -58
data/blogs/dev/dev2.Rmd +0 -65
data/blogs/dev/model.rb +0 -41
data/blogs/dplyr/dplyr.Rmd +0 -29
data/blogs/dplyr/dplyr.html +0 -433
data/blogs/dplyr/dplyr.md +0 -58
data/blogs/dplyr/dplyr.rb +0 -63
data/blogs/galaaz_ggplot/galaaz_ggplot.log +0 -640
data/blogs/galaaz_ggplot/galaaz_ggplot.md +0 -431
data/blogs/galaaz_ggplot/galaaz_ggplot.tex +0 -481
data/blogs/galaaz_ggplot/midwest.png +0 -0
data/blogs/galaaz_ggplot/scatter_plot.png +0 -0
data/blogs/ruby_plot/ruby_plot.tex +0 -1077

data/blogs/nse_dplyr/nse_dplyr.md CHANGED

@@ -4,7 +4,7 @@ author:
     - "Rodrigo Botafogo"
     - "Daniel Mossé - University of Pittsburgh"
 tags: [Tech, Data Science, Ruby, R, GraalVM]
-date: "20/02/2019"
+date: "10/05/2019"
 output:
   html_document:
     self_contained: true
@@ -13,27 +13,41 @@ output:
     includes:
       in_header: ["../../sty/galaaz.sty"]
     number_sections: yes
+    toc: true
+    toc_depth: 2
+  md_document:
+    variant: markdown_github
+fontsize: 11pt
 ---
 # Introduction
-In this post we will see how to program with dplyr in Galaaz.
+In this post we will see how to program with _dplyr_ in Galaaz.
-### But first, what is Galaaz??
+## But first, what is Galaaz??
 Galaaz is a system for tightly coupling Ruby and R.  Ruby is a powerful language, with
 a large community, a very large set of libraries and great for web development.  However,
 it lacks libraries for data science, statistics, scientific plotting and machine learning.
 On the other hand, R is considered one of the most powerful languages for solving all of the
 above problems.  Maybe the strongest competitor to R is Python with libraries such as NumPy,
-Panda, SciPy, SciKit-Learn and a couple more.
+Pandas, SciPy, SciKit-Learn and many more.
 With Galaaz we do not intend to re-implement any of the scientific libraries in R. However, we
 allow for very tight coupling between the two languages to the point that the Ruby
-developer does not need to know that there is an R engine running.  For this to happen we
-use new technologies provided by Oracle: GraalVM, TruffleRuby and FastR:
+developer does not need to know that there is an R engine running.  Also, from the point of
+view of the R user/developer Galaaz looks a lot like R, with just minor syntactic differences,
+so there is almost no learning curve for the R developer. And as we will see in this
+post, programming with _dplyr_ is easier in Galaaz than in R.
+R users are probably quite knowledgeable about _dplyr_. For the Ruby developer, _dplyr_ and
+the _tidyverse_ libraries are a set of libraries for data manipulation in R, developed by
+Hadley Wickham, chief scientist at RStudio and a prolific R coder and writer.
+For the coupling of Ruby and R we use new technologies provided by Oracle: GraalVM,
+TruffleRuby and FastR:
      GraalVM is a universal virtual machine for running applications
      written in JavaScript, Python 3, Ruby, R, JVM-based languages like Java,
@@ -68,10 +82,16 @@ Interested readers should also check out the following sites:
 * [TruffleRuby](https://github.com/oracle/truffleruby)
 * [FastR](https://github.com/oracle/fastr)
 * [Faster R with FastR](https://medium.com/graalvm/faster-r-with-fastr-4b8db0e0dceb)
+* [How to make Beautiful Ruby Plots with Galaaz](https://medium.freecodecamp.org/how-to-make-beautiful-ruby-plots-with-galaaz-320848058857)
+* [Ruby Plotting with Galaaz: An example of tightly coupling Ruby and R in GraalVM](https://towardsdatascience.com/ruby-plotting-with-galaaz-an-example-of-tightly-coupling-ruby-and-r-in-graalvm-520b69e21021)
+* [How to do reproducible research in Ruby with gKnit](https://towardsdatascience.com/how-to-do-reproducible-research-in-ruby-with-gknit-c26d2684d64e)
+* [R for Data Science](https://r4ds.had.co.nz/)
+* [Advanced R](https://adv-r.hadley.nz/)
-### Now to programming with dplyr
+## Programming with dplyr
-According to Hardley (https://dplyr.tidyverse.org/articles/programming.html)
+This post will follow closely the work done in https://dplyr.tidyverse.org/articles/programming.html,
+by Hadley Wickham. In it, Hadley states:
 > Most dplyr functions use non-standard evaluation (NSE). This is a catch-all term that
 > means they don’t follow the usual R rules of evaluation. Instead, they capture the
@@ -80,7 +100,7 @@ According to Hardley (https://dplyr.tidyverse.org/articles/programming.html)
 > Operations on data frames can be expressed succinctly because you don’t need to repeat
 > the name of the data frame. For example, you can write filter(df, x == 1, y == 2, z == 3)
-> instead of df[df$x == 1 & df$y ==2 & df$z == 3, ].
+> instead of df[df\$x == 1 & df\$y ==2 & df\$z == 3, ].
 > dplyr can choose to compute results in a different way to base R. This is important for
 > database backends because dplyr itself doesn’t do any work, but instead generates the SQL
@@ -92,29 +112,9 @@ According to Hardley (https://dplyr.tidyverse.org/articles/programming.html)
 > with a seemingly equivalent object that you’ve defined elsewhere. In other words, this code:
 ```r
 df <- data.frame(x = 1:3, y = 3:1)
-print(df)
-```
-```
-##   x y
-## 1 1 3
-## 2 2 2
-## 3 3 1
-```
-```r
 print(filter(df, x == 1))
-```
-```
-##   x y
-## 1 1 3
-```
-```r
 #> # A tibble: 1 x 2
 #>       x     y
 #>   <int> <int>
@@ -131,15 +131,22 @@ filter(df, my_var == 1)
 ```
 > This makes it hard to create functions with arguments that change how dplyr verbs are computed.
+In this post we will see that programming with _dplyr_ in Galaaz does not require knowledge of
+non-standard evaluation in R and can be accomplished by utilizing normal Ruby constructs.
 # Writing Expressions in Galaaz
-Galaaz extends Ruby to work with complex expressions, similar to R's expressions build with 'quote'
-(base R) or 'quo' (tidyverse).  Let's take a look at some of those expressions.
+Galaaz extends Ruby to work with expressions, similar to R's expressions built with 'quote'
+(base R) or 'quo' (tidyverse).  Expressions in this context are like mathematical expressions or
+formulae.  For instance, in mathematics, the expression $y = sin(x)$ describes a function but cannot
+be computed until $x$ is bound to some value.
+Let's take a look at some of those expressions in Ruby:
 ## Expressions from operators
-The code bellow
-creates an expression summing two symbols
+The code below creates an expression summing two symbols. Note that :a and :b are Ruby symbols and
+are not bound to any value at the time of expression definition:
 ```ruby
@@ -150,7 +157,7 @@ puts exp1
 ```
 ## a + b
 ```
-We can build any complex mathematical expression
+We can build any complex mathematical expression such as:
 ```ruby
@@ -161,8 +168,9 @@ puts exp2
 ```
 ## (a + b) * 2 + c^2L/z
 ```
+The 'L' after the exponent 2 marks it as an integer literal in R.
-It is also possible to use inequality operators in building expressions
+It is also possible to use inequality operators in building expressions:
 ```ruby
@@ -173,6 +181,19 @@ puts exp3
 ```
 ## a + b >= z
 ```
+Expression definitions can also make use of normal Ruby variables:
+```ruby
+x = 20
+y = 30
+exp_var = (:a + :b) * x <= :z - y
+puts exp_var
+```
+```
+## (a + b) * 20L <= z - 30L
+```
 Galaaz provides both symbolic representations for operators, such as (>, <, !=), and functional
 notation for those operators, such as (.gt, .ge, etc.).  So the same expression written
@@ -188,8 +209,9 @@ puts exp4
 ## a + b >= z
 ```
-Two type of expression can only be created with the functional representation of the operators,
-those are expressions involving '==', and '='.  In order to write an expression involving '==' we
+Two types of expression, however, can only be created with the functional representation
+of the operators: those involving '==' and '='.  In order to write an
+expression involving '==' we
 need to use the method '.eq' and for '=' we need the function '.assign'
@@ -228,17 +250,16 @@ puts exp_wrong
 ```
 and it might be difficult to understand what is going on here.  The problem lies with the fact that
 when using '==' we are comparing expression (:a + :b) to expression :z with '=='.  When the
-comparison is executed, the system tries to evaluate :a, :b and :z, and those symbols, at
-this time are not bound to anything and we get a "object 'a' not found" message.
-If we only use functional notation, this type of error will never occur.
+comparison is executed, the system tries to evaluate :a, :b and :z, but those symbols are not
+bound to anything at this time, so we get an "object 'a' not found" message.
+If we only use functional notation, this type of error will not occur.
 ## Expressions with R methods
 It is often necessary to create an expression that uses a method or function.  For instance, in
 mathematics, it's quite natural to write an expression such as $y = sin(x)$. In this case, the
-'sin' function is part of the expression and should not immediately executed. Now, let's say
-that 'x' is an angle of 45$^\circ$ and we acttually want our expression to be $y = 0.850...$.
-When we want the function to be part of the expression, we call the function preceeding it
+'sin' function is part of the expression and should not immediately be executed. When we want
+the function to be part of the expression, we call the function preceding it
 by the letter E, such as 'E.sin(x)'
@@ -250,28 +271,144 @@ puts exp7
 ```
 ## y <- sin(x)
 ```
-However, if we want the function to be evaluated, then
-we use the normal call to function with R as 'R.sin(x)'.
+Expressions can also be written using '.' notation:
 ```ruby
-x = 45
-exp8 = :y.assign R.sin(x)
+exp8 = :y.assign :x.sin
 puts exp8
 ```
+```
+## y <- sin(x)
+```
+When a function has multiple arguments, the first one can be used before the '.':
+```ruby
+exp9 = :x.c(:y)
+puts exp9
+```
+```
+## c(x, y)
+```
+## Evaluating an Expression
+Expressions can be evaluated by calling function 'eval' with a binding. A binding can be provided
+with a list:
+```ruby
+exp = (:a + :b) * 2.0 + :c ** 2 / :z
+puts exp.eval(R.list(a: 10, b: 20, c: 30, z: 40))
+```
+```
+## [1] 82.5
+```
+... with a data frame:
+```ruby
+df = R.data__frame(
+  a: R.c(1, 2, 3),
+  b: R.c(10, 20, 30),
+  c: R.c(100, 200, 300),
+  z: R.c(1000, 2000, 3000))
+puts exp.eval(df)
+```
+```
+## [1] 32 64 96
+```
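For readers who want to try the idea without GraalVM, the following plain-Ruby sketch mimics building an expression from symbols and evaluating it against a binding. It is not how Galaaz is implemented: the `Expr` class, its `eval_with` method and the reopening of `Symbol` are assumptions made purely for illustration.

```ruby
# Hypothetical stand-in for an unevaluated expression (not Galaaz's own class).
class Expr
  OPERATORS = %i[+ - * / **].freeze

  attr_reader :op, :lhs, :rhs

  def initialize(op, lhs, rhs)
    @op = op
    @lhs = lhs
    @rhs = rhs
  end

  # Combining expressions builds a larger tree instead of computing a value.
  OPERATORS.each do |op|
    define_method(op) { |other| Expr.new(op, self, other) }
  end

  # Evaluate the tree against a Hash that binds symbols to values.
  def eval_with(env)
    resolve(lhs, env).public_send(op, resolve(rhs, env))
  end

  def to_s
    "(#{lhs} #{op} #{rhs})"
  end

  private

  # Sub-expressions are evaluated, symbols are looked up, literals pass through.
  def resolve(node, env)
    case node
    when Expr   then node.eval_with(env)
    when Symbol then env.fetch(node)
    else node
    end
  end
end

# Reopening Symbol is global; acceptable for a sketch, risky in a real code base.
class Symbol
  Expr::OPERATORS.each do |op|
    define_method(op) { |other| Expr.new(op, self, other) }
  end
end

exp = (:a + :b) * 2.0 + :c ** 2 / :z
puts exp
puts exp.eval_with(a: 10.0, b: 20.0, c: 30.0, z: 40.0)
```

The bindings are given as floats because plain Ruby truncates integer division, whereas R does not; with these values the last line prints 82.5, the same value reported for the list binding above.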
+# Using Galaaz to call R functions
+Galaaz tries to emulate as closely as possible the way R functions are called and migrating from
+R to Galaaz should be quite easy requiring only minor syntactic changes to an R script.  In
+this post, we do not have enough space to write a complete manual on Galaaz
+(a short manual can be found at: https://www.rubydoc.info/gems/galaaz/0.4.9), so we will
+present only a few example scripts using Galaaz.
+Basically, to call an R function from Ruby with Galaaz, one only needs to precede the function
+with 'R.'.  For instance, to create a vector in R, the 'c' function is used.  From Galaaz, a
+vector can be created by using 'R.c':
+```ruby
+vec = R.c(1.0, 2, 3)
+puts vec
+```
+```
+## [1] 1 2 3
+```
+A list is created in R with the 'list' function, so in Galaaz we do:
+```ruby
+list = R.list(a: 1.0, b: 2, c: 3)
+puts list
+```
+```
+## $a
+## [1] 1
+##
+## $b
+## [1] 2
+##
+## $c
+## [1] 3
+```
+Note that we can use named arguments in our list.  The same code in R would be:
+```r
+lst = list(a = 1, b = 2L, c = 3L)
+print(lst)
+```
+```
+## $a
+## [1] 1
+##
+## $b
+## [1] 2
+##
+## $c
+## [1] 3
+```
+Now, let's say that 'x' is 45 and we actually want to create the expression
+$y = sin(45)$, which is $y = 0.850...$ (R's 'sin' takes its argument in radians, so
+this is not the sine of 45$^\circ$).  In this case, we will use 'R.sin':
+```ruby
+exp10 = :y.assign R.sin(45)
+puts exp10
+```
 ```
 ## y <- 0.850903524534118
 ```
 # Filtering using expressions
-Now that we now how to write expression, we can use then to filter a data frame by expressions.
-Let's first start by creating a simple data frame with two columns named 'x' and 'y'
+Now that we know how to write expressions and call R functions, let's do some data manipulation in
+Galaaz.  Let's first start by creating the same data frame that we created previously in section
+"Programming with dplyr":
 ```ruby
-@df = R.data__frame(x: (1..3), y: (3..1))
-puts @df
+df = R.data__frame(x: (1..3), y: (3..1))
+puts df
 ```
 ```
@@ -280,12 +417,17 @@ puts @df
 ## 2 2 2
 ## 3 3 1
 ```
-In the code bellow we want to filter the data frame by rows in which the value of 'x' is
-equal to 1.
+The 'filter' function can be called on this data frame either by using 'R.filter(df, ...)' or
+by using dot notation.  We prefer to use dot notation as shown below.  The argument to 'filter'
+in Galaaz should be an expression. Note that if we gave filter a Ruby expression such as
+'x == 1', we would get an error, since there is no variable 'x' defined, and if 'x' were a variable
+then 'x == 1' would simply be 'true' or 'false'. Our goal is to filter our data frame returning
+all rows in which the 'x' value is equal to 1. To express this we want: ':x.eq 1', where :x will
+be interpreted by filter as the 'x' column.
 ```ruby
-puts @df.filter(:x.eq 1)
+puts df.filter(:x.eq 1)
 ```
 ```
@@ -294,7 +436,7 @@ puts @df.filter(:x.eq 1)
 ```
 In R, and when coding with 'tidyverse', arguments to a function are usually not
-*referencially transparent*. That is, ou can’t replace a value with a seemingly equivalent
+*referentially transparent*. That is, you can’t replace a value with a seemingly equivalent
 object that you’ve defined elsewhere. In other words, this code
@@ -304,8 +446,8 @@ filter(df, my_var == 1)
 ```
 generates the following error: "object 'x' not found".
-However, in Ruby and Galaaz, arguments are referencially transparent as can be seen by the
-code bellow.  Note, initally that 'my_var = :x' will not give the error "object 'x' not found"
+However, in Galaaz, arguments are referentially transparent as can be seen by the
+code below.  Note initially that 'my_var = :x' will not give the error "object 'x' not found"
 since ':x' is treated as an expression and assigned to my\_var. Then when doing (my\_var.eq 1),
 my\_var is a variable that resolves to ':x' and it becomes equivalent to (:x.eq 1) which is
 what we want.
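The same point can be checked in plain Ruby without Galaaz. The sketch below is only an analogy: `filter_eq` is a hypothetical helper over an array of hashes, not a Galaaz or dplyr function. It shows that a variable holding a symbol can stand in for the symbol literal without changing the result.

```ruby
# Rows modelled as plain hashes; this stands in for the data frame used in this section.
ROWS = [{ x: 1, y: 3 }, { x: 2, y: 2 }, { x: 3, y: 1 }].freeze

# Hypothetical helper: keep the rows whose `column` equals `value`.
def filter_eq(rows, column, value)
  rows.select { |row| row[column] == value }
end

my_var = :x
p filter_eq(ROWS, :x, 1)      # column named with a literal symbol
p filter_eq(ROWS, my_var, 1)  # same column named through a variable
```

Both calls select the single row in which 'x' is 1, because a symbol is an ordinary Ruby value and can be passed around like any other.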
@@ -313,7 +455,7 @@ what we want.
 ```ruby
 my_var = :x
-puts @df.filter(my_var.eq 1)
+puts df.filter(my_var.eq 1)
 ```
 ```
@@ -333,17 +475,17 @@ df[x == y, ]
 ```
 In Galaaz this ambiguity does not exist: filter(df, x.eq y) is not a valid expression as
 expressions are built with symbols.  In doing filter(df, :x.eq y) we are looking for elements
-of the 'x' column that are equal to a previously defined y variable.  Finally,
+of the 'x' column that are equal to a previously defined y variable.  Finally, in
 filter(df, :x.eq :y) we are looking for elements in which the 'x' column value is equal to
 the 'y' column value. This can be seen in the following two chunks of code:
 ```ruby
-@y = 1
-@x = 2
+y = 1
+x = 2
 # looking for values where the 'x' column is equal to the 'y' column
-puts @df.filter(:x.eq :y)
+puts df.filter(:x.eq :y)
 ```
 ```
@@ -355,7 +497,7 @@ puts @df.filter(:x.eq :y)
 ```ruby
 # looking for values where the 'x' column is equal to the 'y' variable
 # in this case, the number 1
-puts @df.filter(:x.eq @y)
+puts df.filter(:x.eq y)
 ```
 ```
@@ -364,7 +506,11 @@ puts @df.filter(:x.eq @y)
 ```
 # Writing a function that applies to different data sets
+Let's suppose that we want to write a function that receives as the first argument a data frame
+and as second argument an expression that adds a column to the data frame that is equal to the
+sum of elements in column 'a' plus 'x'.
+Here is the intended behaviour using the 'mutate' function of 'dplyr':
 ```
 mutate(df1, y = a + x)
@@ -372,8 +518,18 @@ mutate(df2, y = a + x)
 mutate(df3, y = a + x)
 mutate(df4, y = a + x)
 ```
+The naive approach to writing an R function to solve this problem is:
+```
+mutate_y <- function(df) {
+  mutate(df, y = a + x)
+}
+```
+Unfortunately, in R, this function can fail silently if one of the variables isn’t present
+in the data frame, but is present in the global environment.  We will not go through here how
+to solve this problem in R.
-Here we create a mutate_y Ruby method.
+In Galaaz the method mutate_y below will work fine and will never fail silently.
 ```ruby
@@ -381,14 +537,27 @@ def mutate_y(df)
   df.mutate(:y.assign :a + :x)
 end
 ```
-Note that contrary to what happens in R, method mutate_y will fail independetly from the fact
-that variable 'a' is defined or not.
+Here we create a data frame that has only one column named 'x':
 ```ruby
 df1 = R.data__frame(x: (1..3))
 puts df1
+```
+```
+##   x
+## 1 1
+## 2 2
+## 3 3
+```
+Note that method mutate_y will fail regardless of whether variable 'a' is defined and
+in the scope of the method.  Variable 'a' has no relationship with the symbol ':a' used in the
+definition of 'mutate\_y' above:
+```ruby
 a = 10
 mutate_y(df1)
 ```
@@ -402,11 +571,17 @@ mutate_y(df1)
 ##   mismatched protect/unprotect (unprotect with empty protect stack) (RError)
 ## Translated to internal error
 ```
 # Different expressions
+Let's move to the next problem as presented by Hadley, where writing a function in R
+that receives two arguments, the first a variable and the second an expression, is not trivial.
+Below we create a data frame and we want to write a function that groups data by a variable and
+summarises it by an expression:
 ```r
+set.seed(123)
 df <- data.frame(
   g1 = c(1, 1, 2, 2, 2),
   g2 = c(1, 2, 1, 2, 1),
@@ -414,6 +589,19 @@ df <- data.frame(
   b = sample(5)
 )
+as.data.frame(df)
+```
+```
+##   g1 g2 a b
+## 1  1  1 2 1
+## 2  1  2 4 3
+## 3  2  1 5 4
+## 4  2  2 3 2
+## 5  2  1 1 5
+```
+```r
 d2 <- df %>%
   group_by(g1) %>%
   summarise(a = mean(a))
@@ -437,13 +625,11 @@ as.data.frame(d2)
 ```
 ##   g2        a
-## 1  1 3.666667
-## 2  2 2.000000
+## 1  1 2.666667
+## 2  2 3.500000
 ```
-Trying to write a function in R that will receive two argumens, the first a variable and
-the second an expression is not trivia. As shown by Hardley, one might expect this function
-to do the trick:
+As shown by Hadley, one might expect this function to do the trick:
 ```r
@@ -458,11 +644,13 @@ my_summarise <- function(df, group_var) {
 ```
 In order to solve this problem, coding with dplyr requires the introduction of many new concepts
-and functions such as 'quo', 'quos', 'enquo', 'enquos', '!!' (bang bang), '!!!' (triple bang).
+and functions such as 'quo', 'quos', 'enquo', 'enquos', '!!' (bang bang), '!!!' (triple bang).
+Again, we'll leave to Hadley the explanation of how to use all those functions.
 Now, let's try to implement the same function in Galaaz.  The next code block first prints the
-'df' data frame define previously in R, then creates the my_summarize function and calls it
-passing the R data frame and the group by variable ':g1'
+'df' data frame defined previously in R (to access an R variable from Galaaz, we use the tilde
+operator '~' applied to the R variable name as a symbol, i.e., ':df').  We then create the
+'my_summarize' method and call it passing the R data frame and the group by variable ':g1':
 ```ruby
@@ -471,35 +659,35 @@ print "\n"
 def my_summarize(df, group_var)
   df.group_by(group_var).
-    summarize(a: E.mean(:a))
+    summarize(a: :a.mean)
 end
-puts my_summarize((~:df), :g1).as__data__frame
+puts my_summarize(:df, :g1).as__data__frame
 ```
 ```
 ##   g1 g2 a b
-## 1  1  1 5 2
-## 2  1  2 1 5
-## 3  2  1 2 4
-## 4  2  2 3 1
-## 5  2  1 4 3
+## 1  1  1 2 1
+## 2  1  2 4 3
+## 3  2  1 5 4
+## 4  2  2 3 2
+## 5  2  1 1 5
 ##
 ##   g1 a
 ## 1  1 3
 ## 2  2 3
 ```
-It works!!! Well let's make sure this was not just some coincidence
+It works!!! Well, let's make sure this was not just some coincidence
 ```ruby
-puts my_summarize((~:df), :g2).as__data__frame
+puts my_summarize(:df, :g2).as__data__frame
 ```
 ```
 ##   g2        a
-## 1  1 3.666667
-## 2  2 2.000000
+## 1  1 2.666667
+## 2  2 3.500000
 ```
 Great, everything is fine! No magic, no new functions, no complexities, just normal, standard Ruby
@@ -508,7 +696,7 @@ code.  If you've ever done NSE in R, this certainly feels much safer and easy to
 # Different input variables
 In the previous section we've managed to get rid of all NSE formulation for a simple example, but
-does this remain true for more complex examples, or will the Ruby way prove inpractical for
+does this remain true for more complex examples, or will the Galaaz way prove impractical for
 more complex code?
 In the next example Hadley asks us to write a function that given an expression such as 'a'
@@ -526,7 +714,7 @@ summarise(df, mean = mean(a * b), sum = sum(a * b), n = n())
 #> # A tibble: 1 x 3
 #>    mean   sum     n
 #>   <dbl> <int> <int>
-#> 1   9.6    48     5
+#> 1   9    45     5
 ```
 Let's try it in Galaaz:
@@ -549,11 +737,11 @@ puts my_summarise2((~:df), :a * :b)
 ##   mean sum n
 ## 1    3  15 5
 ##   mean sum n
-## 1  7.6  38 5
+## 1    9  45 5
 ```
 Once again, there is no need to use any special theory or functions.  The only point to be
-careful about is the use of 'E' to build an expression that uses the mean, sum and n.
+careful about is the use of 'E' to build expressions from functions 'mean', 'sum' and 'n'.
 # Different input and output variable
@@ -583,8 +771,10 @@ mutate(df, mean_b = mean(b), sum_b = sum(b))
 #> 4     2     2     5     4      3    15
 #> # … with 1 more row
 ```
+In order to solve this problem in R, Hadley needs to introduce some more new functions and notations:
+'quo_name' and the ':=' operator from package 'rlang'.
-Here is our Ruby code
+Here is our Ruby code:
 ```ruby
@@ -602,17 +792,17 @@ puts my_mutate((~:df), :b)
 ```
 ##   g1 g2 a b mean_a sum_a
-## 1  1  1 5 2      3    15
-## 2  1  2 1 5      3    15
-## 3  2  1 2 4      3    15
-## 4  2  2 3 1      3    15
-## 5  2  1 4 3      3    15
+## 1  1  1 2 1      3    15
+## 2  1  2 4 3      3    15
+## 3  2  1 5 4      3    15
+## 4  2  2 3 2      3    15
+## 5  2  1 1 5      3    15
 ##   g1 g2 a b mean_b sum_b
-## 1  1  1 5 2      3    15
-## 2  1  2 1 5      3    15
-## 3  2  1 2 4      3    15
-## 4  2  2 3 1      3    15
-## 5  2  1 4 3      3    15
+## 1  1  1 2 1      3    15
+## 2  1  2 4 3      3    15
+## 3  2  1 5 4      3    15
+## 4  2  2 3 2      3    15
+## 5  2  1 1 5      3    15
 ```
 It really seems that "Non Standard Evaluation" is actually quite standard in Galaaz! But, you
 might have noticed a small change in the way the arguments to the mutate method were called.
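The difference between ':' and '=>' in a Ruby argument list can be seen in plain Ruby, independent of Galaaz. In the sketch below, `summary_args` and `sum_name` are hypothetical names chosen for illustration, and the string values are placeholders rather than real expressions.

```ruby
# Hypothetical helper: build keyword-style arguments whose names are computed at run time.
def summary_args(col)
  mean_name = "mean_#{col}"
  sum_name = "sum_#{col}"
  # '=>' uses the value held by the variable as the key.
  { mean_name => "mean(#{col})", sum_name => "sum(#{col})" }
end

p summary_args(:a).keys
p summary_args(:b).keys

# With 'mean_name:' the key is the literal symbol :mean_name, whatever any variable holds.
p({ mean_name: 1 }.keys)
```

This is why a computed column name has to be written with '=>': the 'name:' form always produces the same fixed key.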
@@ -624,6 +814,12 @@ and variable mean\_name is not followed by ':' but by '=>'.  This is standard Ru
 # Capturing multiple variables
+Moving on with new complexities, Hadley asks us to solve the problem in which the
+summarise function will receive any number of grouping variables.
+This again is quite standard Ruby.  In order to receive an undefined number of parameters
+the parameter is preceded by '*':
 ```ruby
 def my_summarise3(df, *group_vars)
@@ -636,14 +832,58 @@ puts my_summarise3((~:df), :g1, :g2).as__data__frame
 ```
 ##   g1 g2 a
-## 1  1  1 5
-## 2  1  2 1
+## 1  1  1 2
+## 2  1  2 4
 ## 3  2  1 3
 ## 4  2  2 3
 ```
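As a plain-Ruby illustration of the splat parameter, independent of Galaaz, the sketch below groups rows by any number of keys and averages one column. `mean_by` and the hash-based rows are assumptions for illustration; the numbers are the 'g1', 'g2' and 'a' columns of the data frame printed earlier.

```ruby
GROUP_ROWS = [
  { g1: 1, g2: 1, a: 2 },
  { g1: 1, g2: 2, a: 4 },
  { g1: 2, g2: 1, a: 5 },
  { g1: 2, g2: 2, a: 3 },
  { g1: 2, g2: 1, a: 1 }
].freeze

# '*group_vars' collects every argument after 'value' into an Array.
def mean_by(rows, value, *group_vars)
  rows.group_by { |row| row.values_at(*group_vars) }
      .transform_values { |group| group.sum { |row| row[value] }.fdiv(group.size) }
end

p mean_by(GROUP_ROWS, :a, :g1)
p mean_by(GROUP_ROWS, :a, :g1, :g2)
```

Grouping by ':g1' alone gives a mean of 3.0 for both groups, and grouping by ':g1' and ':g2' gives 2.0, 4.0, 3.0 and 3.0, which agrees with the Galaaz outputs shown in this post.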
+# Why does R require NSE and Galaaz does not?
+NSE introduces a number of new concepts, such as 'quoting', 'quasiquotation', 'unquoting' and
+'unquote-splicing', while in Galaaz none of those concepts are needed. What gives?
+R is an extremely flexible language and it has lazy evaluation of parameters. When in R a
+function is called as 'summarise(df, a = b)', the summarise function receives the literal
+'a = b' parameter and can work with this as if it were a string. In R, it is not clear what
+a and b are: they can be expressions or they can be variables, and it is up to the function to
+decide what 'a = b' means.
+In Ruby, there is no lazy evaluation of parameters and 'a' is always a variable and so is 'b'.
+Variables assume their value as soon as they are used, so 'x = a' is immediately evaluated and
+variable 'x' will receive the value of variable 'a' as soon as the Ruby statement is executed.
+Ruby also provides the notion of a symbol; ':a' is a symbol and does not evaluate to anything.
+Galaaz uses Ruby symbols to build expressions that are not bound to anything: ':a.eq :b' is
+clearly an expression and has no relationship whatsoever with the statement 'a = b'. By using
+symbols, variables and expressions all the possible ambiguities that are found in R are
+eliminated in Galaaz.
+The main problem that remains is that in R, functions are not clearly documented as to what type
+of input they are expecting: they might be expecting regular variables or they might be
+expecting expressions, and the R function will know how to deal with an input of the form
+'a = b'.  For the Ruby developer, however, it might not be immediately clear if it should call the
+function passing the value 'true' if variable 'a' is equal to variable 'b' or if it should
+call the function passing the expression ':a.eq :b'.
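The eager evaluation described above is easy to observe in plain Ruby, without Galaaz. In this sketch `describe` is a hypothetical helper that simply reports what a method receives:

```ruby
# Hypothetical helper that reports what a method actually receives.
def describe(arg)
  case arg
  when true, false then "boolean #{arg}"
  when Symbol      then "symbol #{arg.inspect}"
  else                  "other #{arg.inspect}"
  end
end

a = 1
b = 1

puts describe(a == b)  # the comparison runs first; the method only sees its result
puts describe(:a)      # a symbol is passed through unevaluated
```

The first call prints "boolean true" and the second prints "symbol :a": the method never sees the text 'a == b', only the value it produced, while a symbol arrives untouched and can be used to build an expression.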
 # Advanced dplyr features
-https://www.r-bloggers.com/programming-with-dplyr-by-using-dplyr/
+In the blog: Programming with dplyr by using dplyr (https://www.r-bloggers.com/programming-with-dplyr-by-using-dplyr/) Iñaki Úcar shows surprise that some R users are trying to code in dplyr avoiding
+the use of NSE.  For instance he says:
+> Take the example of seplyr. It stands for standard evaluation dplyr, and enables us to
+> program over dplyr without having “to bring in (or study) any deep-theory or
+> heavy-weight tools such as rlang/tidyeval”.
+For me, there isn't really any surprise that users are trying to avoid dplyr deep-theory. R
+users frequently are not programmers and learning to code is already hard business; on top
+of that, having to learn how to 'quote' or 'enquo' or 'quos' or 'enquos' is not necessarily
+a 'piece of cake'. So much so that 'tidyeval' has some more advanced functions that, instead
+of using quoted expressions, use strings as arguments.
+In the following examples, we show the use of functions 'group\_by\_at', 'summarise\_at' and
+'rename\_at' that receive strings as argument. The data frame used is 'starwars', which describes
+features of characters in the Star Wars movies:
 ```ruby
@@ -680,6 +920,8 @@ puts (~:starwars).head.as__data__frame
 ## 5              Imperial Speeder Bike
 ## 6
 ```
+The grouped_mean function below will receive a grouping variable and calculate summaries for
+the value\_variables given:
 ```r
@@ -716,6 +958,8 @@ as.data.frame(gm)
 ## 15        yellow  81.11111        76.38000    11
 ```
+The same code with Galaaz becomes:
 ```ruby
 def grouped_mean(data, grouping_variables, value_variables)
@@ -723,10 +967,10 @@ def grouped_mean(data, grouping_variables, value_variables)
     group_by_at(grouping_variables).
     mutate(count: E.n).
     summarise_at(E.c(value_variables, "count"), ~:mean, na__rm: true).
-    rename_at(value_variables, R.funs(E.paste0("mean_", value_variables)))
+    rename_at(value_variables, E.funs(E.paste0("mean_", value_variables)))
 end
-puts grouped_mean((~:starwars), "eye_color", R.c("mass", "birth_year")).as__data__frame
+puts grouped_mean((~:starwars), "eye_color", E.c("mass", "birth_year")).as__data__frame
 ```
 ```
@@ -747,3 +991,13 @@ puts grouped_mean((~:starwars), "eye_color", R.c("mass", "birth_year")).as__data
 ## 14         white  48.00000             NaN     1
 ## 15        yellow  81.11111        76.38000    11
 ```
+# Conclusion
+Ruby and Galaaz provide a nice framework for developing code that uses R functions. Although R is
+a very powerful and flexible language, sometimes, too much flexibility makes life harder for
+the casual user. We believe, however, that even for the advanced user, Ruby integrated
+with R through Galaaz makes a powerful environment for data analysis.  In this blog post we
+showed how Galaaz's consistent syntax eliminates the need for complex constructs such as quoting,
+enquoting, quasiquotation, etc. This simplification comes from the fact that expressions and
+variables are clearly separated objects, which is not the case in the R language.