RubyGems - galaaz - Versions diffs - 0.4.9 → 0.4.10 - Mend

galaaz 0.4.9 → 0.4.10

Files changed (76)

checksums.yaml +4 -4
data/README.md +798 -285
data/blogs/galaaz_ggplot/galaaz_ggplot.Rmd +3 -12
data/blogs/galaaz_ggplot/galaaz_ggplot.aux +5 -7
data/blogs/galaaz_ggplot/galaaz_ggplot.html +69 -29
data/blogs/galaaz_ggplot/galaaz_ggplot.pdf +0 -0
data/blogs/galaaz_ggplot/galaaz_ggplot_files/figure-html/midwest_rb.png +0 -0
data/blogs/galaaz_ggplot/galaaz_ggplot_files/figure-html/scatter_plot_rb.png +0 -0
data/blogs/galaaz_ggplot/galaaz_ggplot_files/figure-latex/midwest_rb.pdf +0 -0
data/blogs/galaaz_ggplot/galaaz_ggplot_files/figure-latex/scatter_plot_rb.pdf +0 -0
data/blogs/galaaz_ggplot/midwest.Rmd +1 -9
data/blogs/gknit/gknit.Rmd +37 -40
data/blogs/gknit/gknit.html +32 -30
data/blogs/gknit/gknit.md +36 -37
data/blogs/gknit/gknit.pdf +0 -0
data/blogs/gknit/gknit.tex +35 -37
data/blogs/manual/manual.Rmd +548 -125
data/blogs/manual/manual.html +509 -286
data/blogs/manual/manual.md +798 -285
data/blogs/manual/manual.pdf +0 -0
data/blogs/manual/manual.tex +2816 -0
data/blogs/manual/manual_files/figure-latex/diverging_bar.pdf +0 -0
data/blogs/nse_dplyr/nse_dplyr.Rmd +240 -74
data/blogs/nse_dplyr/nse_dplyr.html +191 -87
data/blogs/nse_dplyr/nse_dplyr.md +361 -107
data/blogs/nse_dplyr/nse_dplyr.pdf +0 -0
data/blogs/nse_dplyr/nse_dplyr.tex +1373 -0
data/blogs/ruby_plot/ruby_plot.Rmd +61 -81
data/blogs/ruby_plot/ruby_plot.html +54 -57
data/blogs/ruby_plot/ruby_plot.md +48 -67
data/blogs/ruby_plot/ruby_plot.pdf +0 -0
data/blogs/ruby_plot/ruby_plot_files/figure-html/dose_len.png +0 -0
data/blogs/ruby_plot/ruby_plot_files/figure-html/facet_by_delivery.png +0 -0
data/blogs/ruby_plot/ruby_plot_files/figure-html/facet_by_dose.png +0 -0
data/blogs/ruby_plot/ruby_plot_files/figure-html/facets_by_delivery_color.png +0 -0
data/blogs/ruby_plot/ruby_plot_files/figure-html/facets_by_delivery_color2.png +0 -0
data/blogs/ruby_plot/ruby_plot_files/figure-html/facets_with_jitter.png +0 -0
data/blogs/ruby_plot/ruby_plot_files/figure-html/facets_with_points.png +0 -0
data/blogs/ruby_plot/ruby_plot_files/figure-html/final_box_plot.png +0 -0
data/blogs/ruby_plot/ruby_plot_files/figure-html/final_violin_plot.png +0 -0
data/blogs/ruby_plot/ruby_plot_files/figure-html/violin_with_jitter.png +0 -0
data/blogs/ruby_plot/ruby_plot_files/figure-latex/dose_len.png +0 -0
data/blogs/ruby_plot/ruby_plot_files/figure-latex/facet_by_delivery.png +0 -0
data/blogs/ruby_plot/ruby_plot_files/figure-latex/facet_by_dose.png +0 -0
data/blogs/ruby_plot/ruby_plot_files/figure-latex/facets_by_delivery_color.png +0 -0
data/blogs/ruby_plot/ruby_plot_files/figure-latex/facets_by_delivery_color2.png +0 -0
data/blogs/ruby_plot/ruby_plot_files/figure-latex/facets_with_decorations.png +0 -0
data/blogs/ruby_plot/ruby_plot_files/figure-latex/facets_with_jitter.png +0 -0
data/blogs/ruby_plot/ruby_plot_files/figure-latex/facets_with_points.png +0 -0
data/blogs/ruby_plot/ruby_plot_files/figure-latex/final_box_plot.png +0 -0
data/blogs/ruby_plot/ruby_plot_files/figure-latex/final_violin_plot.png +0 -0
data/blogs/ruby_plot/ruby_plot_files/figure-latex/violin_with_jitter.png +0 -0
data/lib/R_interface/rdata_frame.rb +0 -12
data/lib/R_interface/robject.rb +14 -14
data/lib/R_interface/ruby_extensions.rb +3 -31
data/lib/R_interface/rvector.rb +0 -12
data/lib/gknit/knitr_engine.rb +5 -3
data/lib/util/exec_ruby.rb +22 -61
data/specs/tmp.rb +26 -12
data/version.rb +1 -1
metadata +22 -17
data/bin/gknit_old_r +0 -236
data/blogs/dev/dev.Rmd +0 -23
data/blogs/dev/dev.md +0 -58
data/blogs/dev/dev2.Rmd +0 -65
data/blogs/dev/model.rb +0 -41
data/blogs/dplyr/dplyr.Rmd +0 -29
data/blogs/dplyr/dplyr.html +0 -433
data/blogs/dplyr/dplyr.md +0 -58
data/blogs/dplyr/dplyr.rb +0 -63
data/blogs/galaaz_ggplot/galaaz_ggplot.log +0 -640
data/blogs/galaaz_ggplot/galaaz_ggplot.md +0 -431
data/blogs/galaaz_ggplot/galaaz_ggplot.tex +0 -481
data/blogs/galaaz_ggplot/midwest.png +0 -0
data/blogs/galaaz_ggplot/scatter_plot.png +0 -0
data/blogs/ruby_plot/ruby_plot.tex +0 -1077

@@ -98,15 +98,13 @@ Panda, SciPy, SciKit-Learn and a couple more.
 # gKnitting a Document
 This manual has been formatted using gKnit.  gKnit uses Knitr and R markdown to knit
-a document in Ruby or R and output it in any of the available formats for R markdown.
+a document in Ruby or R and output it in any of the available formats for R markdown.
 gKnit runs atop GraalVM and Galaaz.  In gKnit, Ruby variables are persisted between
-chunks, making it an ideal solution for literate programming.
-Also, since it is based on Galaaz, Ruby chunks can have access to R variables and Polyglot
-Programming with Ruby and R is quite natural.
+chunks, making it an ideal solution for literate programming. Also, since it is based
+on Galaaz, Ruby chunks can have access to R variables and Polyglot Programming with
+Ruby and R is quite natural.
-gknit was describe in more depth in:
-* xxx.xxxx.xxx
+[gknit is described in more detail here](https://towardsdatascience.com/how-to-do-reproducible-research-in-ruby-with-gknit-c26d2684d64e)
 # Vector
@@ -134,15 +132,15 @@ To create a vector the 'c' (concatenate) method from the 'R' module should be us
 ```ruby
-@vec = R.c(1, 2, 3)
-puts @vec
+vec = R.c(1, 2, 3)
+puts vec
 ```
 ```
 ## [1] 1 2 3
 ```
-Lets take a look at the type, mode and storage.mode of our vector @vec.  In order to print
+Let's take a look at the type, mode and storage.mode of our vector vec.  In order to print
 this out, we are creating a data frame 'df' and printing it out.  A data frame, for those
 not familiar with it, is basically a table.  Here we create the data frame and add the
 column name by passing named parameters for each column, such as 'typeof:', 'mode:' and
@@ -154,7 +152,7 @@ data frame is 'data.frame', in Galaaz we use 'data\_\_frame'.
 ```ruby
-df = R.data__frame(typeof: @vec.typeof, mode: @vec.mode, storage__mode: @vec.storage__mode)
+df = R.data__frame(typeof: vec.typeof, mode: vec.mode, storage__mode: vec.storage__mode)
 puts df
 ```
@@ -170,8 +168,8 @@ follows normal Ruby rules and the number 1 is an integer and 1.0 is a float.
 ```ruby
-@vec = R.c(1.0, 2, 3)
-puts @vec
+vec = R.c(1.0, 2, 3)
+puts vec
 ```
 ```
@@ -180,7 +178,7 @@ puts @vec
 ```ruby
-df = R.data__frame(typeof: @vec.typeof, mode: @vec.mode, storage__mode: @vec.storage__mode)
+df = R.data__frame(typeof: vec.typeof, mode: vec.mode, storage__mode: vec.storage__mode)
 outputs df.kable.kable_styling
 ```
@@ -213,14 +211,14 @@ vec = R.c(1, hello, 5)
 ```
 ## Message:
-##  undefined local variable or method `hello' for RubyChunk:Class
+##  undefined local variable or method `hello' for #<RC:0x2e0 @out_list=nil>:RC
 ```
 ```
 ## Message:
-##  (eval):1:in `exec_ruby'
-## /home/rbotafogo/desenv/galaaz/lib/util/exec_ruby.rb:141:in `instance_eval'
-## /home/rbotafogo/desenv/galaaz/lib/util/exec_ruby.rb:141:in `exec_ruby'
+##  /home/rbotafogo/desenv/galaaz/lib/util/exec_ruby.rb:103:in `get_binding'
+## /home/rbotafogo/desenv/galaaz/lib/util/exec_ruby.rb:102:in `eval'
+## /home/rbotafogo/desenv/galaaz/lib/util/exec_ruby.rb:102:in `exec_ruby'
 ## /home/rbotafogo/desenv/galaaz/lib/gknit/knitr_engine.rb:650:in `block in initialize'
 ## /home/rbotafogo/desenv/galaaz/lib/R_interface/ruby_callback.rb:77:in `call'
 ## /home/rbotafogo/desenv/galaaz/lib/R_interface/ruby_callback.rb:77:in `callback'
@@ -245,8 +243,8 @@ Here is a vector with logical values
 ```ruby
-@vec = R.c(true, true, false, false, true)
-puts @vec
+vec = R.c(true, true, false, false, true)
+puts vec
 ```
 ```
@@ -259,26 +257,26 @@ The 'c' functions used to create vectors can also be used to combine two vectors
 ```ruby
-@vec1 = R.c(10.0, 20.0, 30.0)
-@vec2 = R.c(4.0, 5.0, 6.0)
-@vec = R.c(@vec1, @vec2)
-puts @vec
+vec1 = R.c(10.0, 20.0, 30.0)
+vec2 = R.c(4.0, 5.0, 6.0)
+vec = R.c(vec1, vec2)
+puts vec
 ```
 ```
 ## [1] 10 20 30  4  5  6
 ```
 In Galaaz, methods can be chained (somewhat like the pipe operator in R %>%, but more generic).
-In this next example, method 'c' is chainned after '@vec1'.  This also looks like 'c' is a
+In this next example, method 'c' is chained after 'vec1'.  This also looks like 'c' is a
 method of the vector, but in reality, this is actually closer to the pipe operator.  When
 Galaaz identifies that 'c' is not a method of 'vec1' it actually tries to call 'R.c' with
-'@vec1' as the first argument concatenated with all the other available arguments.  The code
+'vec1' as the first argument concatenated with all the other available arguments.  The code
 below is automatically converted to the code above.
 ```ruby
-@vec = @vec1.c(@vec2)
-puts @vec
+vec = vec1.c(vec2)
+puts vec
 ```
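The fallback described above (an unknown method on a vector being retried as a function on 'R' with the vector as first argument) can be sketched in plain Ruby. This is only an illustration of the idea: 'ToyR' and 'ToyVector' are hypothetical names, and using 'method_missing' is an assumption about how such forwarding could work, not Galaaz's actual code.

```ruby
# Hypothetical stand-in for the 'R' module: only 'c' is provided.
module ToyR
  # Concatenate plain numbers and ToyVector objects into a new ToyVector.
  def self.c(*args)
    ToyVector.new(args.flat_map { |arg| arg.is_a?(ToyVector) ? arg.values : [arg] })
  end
end

# Hypothetical vector wrapper around a Ruby array.
class ToyVector
  attr_reader :values

  def initialize(values)
    @values = values
  end

  # A method the vector does not define is forwarded to ToyR,
  # with the vector itself as the first argument.
  def method_missing(name, *args, &block)
    return super unless ToyR.respond_to?(name)

    ToyR.public_send(name, self, *args, &block)
  end

  def respond_to_missing?(name, include_private = false)
    ToyR.respond_to?(name) || super
  end
end

vec1 = ToyR.c(10.0, 20.0, 30.0)
vec2 = ToyR.c(4.0, 5.0, 6.0)

chained = vec1.c(vec2)      # forwarded to ToyR.c(vec1, vec2)
direct  = ToyR.c(vec1, vec2)

p chained.values
p chained.values == direct.values
```

Both forms build the same six-element vector, which is the equivalence the manual describes between the chained call and the explicit 'R.c' call.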
 ```
@@ -291,7 +289,7 @@ Arithmetic operations on vectors are performed element by element:
 ```ruby
-puts @vec1 + @vec2
+puts vec1 + vec2
 ```
 ```
@@ -300,7 +298,7 @@ puts @vec1 + @vec2
 ```ruby
-puts @vec1 * 5
+puts vec1 * 5
 ```
 ```
@@ -311,8 +309,8 @@ When vectors have different length, a recycling rule is applied to the shorter v
 ```ruby
-@vec3 = R.c(1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0)
-puts @vec4 = @vec1 + @vec3
+vec3 = R.c(1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0)
+puts vec4 = vec1 + vec3
 ```
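As a rough illustration of the recycling rule, the plain-Ruby sketch below adds a 3-element array to a 9-element array by reusing the shorter one cyclically. The helper name 'recycled_add' is hypothetical and the code does not use Galaaz; it only mimics what R does when the longer length is a multiple of the shorter one.

```ruby
# Element-wise addition in which the shorter array is reused cyclically.
# Hypothetical helper for illustration; not part of Galaaz or R.
def recycled_add(left, right)
  longer, shorter = left.length >= right.length ? [left, right] : [right, left]
  longer.each_with_index.map { |value, i| value + shorter[i % shorter.length] }
end

vec1 = [10.0, 20.0, 30.0]
vec3 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]

vec4 = recycled_add(vec1, vec3)
p vec4
```

Here 'vec1' is applied three times over, so the first, fourth and seventh elements of 'vec3' each get 10.0 added.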
 ```
@@ -325,7 +323,7 @@ Vectors can be indexed by using the '[]' operator:
 ```ruby
-puts @vec4[3]
+puts vec4[3]
 ```
 ```
@@ -333,11 +331,11 @@ puts @vec4[3]
 ```
 We can also index a vector with another vector.  For example, in the code below, we take elements
-1, 3, 5, and 7 from @vec3:
+1, 3, 5, and 7 from vec4:
 ```ruby
-puts @vec4[R.c(1, 3, 5, 7)]
+puts vec4[R.c(1, 3, 5, 7)]
 ```
 ```
@@ -348,7 +346,7 @@ Repeating an index and having indices out of order is valid code:
 ```ruby
-puts @vec4[R.c(1, 3, 3, 1)]
+puts vec4[R.c(1, 3, 3, 1)]
 ```
 ```
@@ -360,8 +358,8 @@ the indexed values are not returned:
 ```ruby
-puts @vec4[-3]
-puts @vec4[-R.c(1, 3, 5, 7)]
+puts vec4[-3]
+puts vec4[-R.c(1, 3, 5, 7)]
 ```
 ```
@@ -373,7 +371,7 @@ If an index is out of range, a missing value (NA) will be reported.
 ```ruby
-puts @vec4[30]
+puts vec4[30]
 ```
 ```
@@ -384,7 +382,7 @@ It is also possible to index a vector by range:
 ```ruby
-puts @vec4[(2..5)]
+puts vec4[(2..5)]
 ```
 ```
@@ -427,9 +425,9 @@ from the vector. In order to do this extraction the '>>' operator is used.
 ```ruby
-puts @vec4
-puts @vec4 >> 0
-puts @vec4 >> 4
+puts vec4
+puts vec4 >> 0
+puts vec4 >> 4
 ```
 ```
@@ -929,11 +927,11 @@ created by the 'matrix' function:
 ```ruby
-@mat = R.matrix(R.c(1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0),
-                nrow: 3,
-                ncol: 3)
+mat = R.matrix(R.c(1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0),
+               nrow: 3,
+               ncol: 3)
-puts @mat
+puts mat
 ```
 ```
@@ -947,12 +945,12 @@ memory by row first passing an extra argument to the 'matrix' function:
 ```ruby
-@mat_row = R.matrix(R.c(1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0),
-                nrow: 3,
-                ncol: 3,
-                byrow: true)
+mat_row = R.matrix(R.c(1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0),
+                   nrow: 3,
+                   ncol: 3,
+                   byrow: true)
-puts @mat_row
+puts mat_row
 ```
 ```
@@ -968,8 +966,8 @@ A matrix can be indexed by [row, column]:
 ```ruby
-puts @mat_row[1, 1]
-puts @mat_row[2, 3]
+puts mat_row[1, 1]
+puts mat_row[2, 3]
 ```
 ```
@@ -980,8 +978,8 @@ It is possible to index an entire row or column with the ':all' keyword
 ```ruby
-puts @mat_row[1, :all]
-puts @mat_row[:all, 2]
+puts mat_row[1, :all]
+puts mat_row[:all, 2]
 ```
 ```
@@ -994,7 +992,7 @@ rows 1 and 3 and columns 2 and 3 building a 2 x 2 matrix.
 ```ruby
-puts @mat_row[R.c(1, 3), R.c(2, 3)]
+puts mat_row[R.c(1, 3), R.c(2, 3)]
 ```
 ```
@@ -1003,12 +1001,11 @@ puts @mat_row[R.c(1, 3), R.c(2, 3)]
 ## [2,]    8    9
 ```
-Matrices can be combined with functions 'rbind' and 'cbind'
+Matrices can be combined with the function 'rbind':
 ```ruby
-puts @mat_row.rbind(@mat)
-puts @mat_row.cbind(@mat)
+puts mat_row.rbind(mat)
 ```
 ```
@@ -1019,6 +1016,16 @@ puts @mat_row.cbind(@mat)
 ## [4,]    1    4    7
 ## [5,]    2    5    8
 ## [6,]    3    6    9
+```
+and 'cbind':
+```ruby
+puts mat_row.cbind(mat)
+```
+```
 ##      [,1] [,2] [,3] [,4] [,5] [,6]
 ## [1,]    1    2    3    1    4    7
 ## [2,]    4    5    6    2    5    8
@@ -1035,8 +1042,8 @@ can only hold one type of element.
 nums = R.c(1.0, 2.0, 3.0)
 strs = R.c("a", "b", "c", "d")
 bool = R.c(true, true, false)
-@lst = R.list(nums: nums, strs: strs, bool: bool)
-puts @lst
+lst = R.list(nums: nums, strs: strs, bool: bool)
+puts lst
 ```
 ```
@@ -1050,7 +1057,7 @@ puts @lst
 ## [1]  TRUE  TRUE FALSE
 ```
-Note that '@lst' elements are named elements.
+Note that 'lst' elements are named elements.
 ## List Indexing
@@ -1061,7 +1068,7 @@ return one of the sublists.
 ```ruby
-puts @lst[1]
+puts lst[1]
 ```
 ```
@@ -1076,18 +1083,18 @@ the original list
 ```ruby
-puts @lst[[1]]
+puts lst[[1]]
 ```
 ```
 ## [1] 1 2 3
 ```
-When elements are named, as dones with @lst, indexing can be done by name:
+When elements are named, as done with lst, indexing can be done by name:
 ```ruby
-puts @lst[['bool']][[1]] >> 0
+puts lst[['bool']][[1]] >> 0
 ```
 ```
@@ -1207,23 +1214,31 @@ puts (~:mtcars)[R.c('Datsun 710', 'Camaro Z28'), :all]
 Finally, a data frame can also be indexed with a logical vector.  In this next example, the
 'am' column of :mtcars is compared with 0 (with method 'eq').  When 'am' is equal to 0 the
 car is automatic.  So, by doing '(~:mtcars).am.eq 0' a logical vector is created with
-'true' whenever 'am' is 0 and 'false' otherwise.  Using this logical vector, the data frame
-is indexed, returning a new data frame in which all cars have automatic transmission.
+'true' whenever 'am' is 0 and 'false' otherwise.
 ```ruby
 # obtain a vector with 'true' for cars with automatic transmission
 automatic = (~:mtcars).am.eq 0
 puts automatic
-# slice the data frame by using this vector
-puts (~:mtcars)[automatic, :all]
 ```
 ```
 ##  [1] FALSE FALSE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
 ## [12]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE FALSE FALSE  TRUE  TRUE
 ## [23]  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
+```
+Using this logical vector, the data frame is indexed, returning a new data frame in
+which all cars have automatic transmission.
+```ruby
+# slice the data frame by using this vector
+puts (~:mtcars)[automatic, :all]
+```
+```
 ##                      mpg cyl  disp  hp drat    wt  qsec vs am gear carb
 ## Hornet 4 Drive      21.4   6 258.0 110 3.08 3.215 19.44  1  0    3    1
 ## Hornet Sportabout   18.7   8 360.0 175 3.15 3.440 17.02  0  0    3    2
@@ -1366,6 +1381,62 @@ puts exp7
 ## y <- sin(x)
 ```
+Expressions can also be written using '.' notation:
+```ruby
+exp8 = :y.assign :x.sin
+puts exp8
+```
+```
+## y <- sin(x)
+```
+When a function has multiple arguments, the first one can be used before the '.':
+```ruby
+exp9 = :x.c(:y)
+puts exp9
+```
+```
+## c(x, y)
+```
+## Evaluating an Expression
+Expressions can be evaluated by calling function 'eval' with a binding. A binding can be provided
+with a list:
+```ruby
+exp = (:a + :b) * 2.0 + :c ** 2 / :z
+puts exp.eval(R.list(a: 10, b: 20, c: 30, z: 40))
+```
+```
+## [1] 82.5
+```
+... with a data frame:
+```ruby
+df = R.data__frame(
+  a: R.c(1, 2, 3),
+  b: R.c(10, 20, 30),
+  c: R.c(100, 200, 300),
+  z: R.c(1000, 2000, 3000))
+puts exp.eval(df)
+```
+```
+## [1] 32 64 96
+```
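The two results above can be checked by hand. The sketch below redoes the arithmetic in plain Ruby, with a lambda standing in for the expression and a hash standing in for the binding. The names 'expression', 'binding_list' and 'columns' are hypothetical, and floats are used throughout because R numerics are doubles, whereas Ruby integers would truncate the division.

```ruby
# Plain-Ruby stand-in for the expression (:a + :b) * 2.0 + :c ** 2 / :z.
# Illustration only; this is not how Galaaz builds or evaluates expressions.
expression = ->(a:, b:, c:, z:) { (a + b) * 2.0 + c**2 / z }

# Binding given as a list: one value per name.
binding_list = { a: 10.0, b: 20.0, c: 30.0, z: 40.0 }
single = expression.call(**binding_list)
p single

# Binding given as a data frame: one column per name, evaluated row by row.
columns = {
  a: [1.0, 2.0, 3.0],
  b: [10.0, 20.0, 30.0],
  c: [100.0, 200.0, 300.0],
  z: [1000.0, 2000.0, 3000.0]
}
rows = columns.values.transpose.map { |vals| columns.keys.zip(vals).to_h }
per_row = rows.map { |row| expression.call(**row) }
p per_row
```

For the list, (10 + 20) * 2 + 30² / 40 = 60 + 22.5 = 82.5. For the first data frame row, (1 + 10) * 2 + 100² / 1000 = 22 + 10 = 32.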
 # Manipulating Data
 One of the major benefits of Galaaz is to bring strong data manipulation to Ruby. The following
@@ -1389,8 +1460,8 @@ R.library('dplyr')
 ```ruby
-@flights = ~:flights
-puts @flights.head.as__data__frame
+flights = ~:flights
+puts flights.head.as__data__frame
 ```
 ```
@@ -1424,7 +1495,7 @@ the first :month.eq 1
 ```ruby
-puts @flights.filter((:month.eq 1), (:day.eq 1)).head.as__data__frame
+puts flights.filter((:month.eq 1), (:day.eq 1)).head.as__data__frame
 ```
 ```
@@ -1457,7 +1528,7 @@ All flights that departed in November of December
 ```ruby
-puts @flights.filter((:month.eq 11) | (:month.eq 12)).head.as__data__frame
+puts flights.filter((:month.eq 11) | (:month.eq 12)).head.as__data__frame
 ```
 ```
@@ -1491,7 +1562,7 @@ symbol, in this case ':in' and the second argument is the vector:
 ```ruby
-puts @flights.filter(:month._ :in, R.c(11, 12)).head.as__data__frame
+puts flights.filter(:month._ :in, R.c(11, 12)).head.as__data__frame
 ```
 ```
@@ -1527,8 +1598,8 @@ what is obtained from data frame.
 ```ruby
-@df = R.tibble(x: R.c(1, R::NA, 3))
-puts @df.as__data__frame
+df = R.tibble(x: R.c(1, R::NA, 3))
+puts df.as__data__frame
 ```
 ```
@@ -1543,7 +1614,7 @@ not.
 ```ruby
-puts @df.filter(:x > 1).as__data__frame
+puts df.filter(:x > 1).as__data__frame
 ```
 ```
@@ -1555,7 +1626,7 @@ To match an NA use method 'is__na'
 ```ruby
-puts @df.filter((:x.is__na) | (:x > 1)).as__data__frame
+puts df.filter((:x.is__na) | (:x > 1)).as__data__frame
 ```
 ```
@@ -1570,7 +1641,7 @@ Arrange reorders the rows of a data frame by the given arguments.
 ```ruby
-puts @flights.arrange(:year, :month, :day).head.as__data__frame
+puts flights.arrange(:year, :month, :day).head.as__data__frame
 ```
 ```
@@ -1601,7 +1672,7 @@ To arrange in descending order, use function 'desc'
 ```ruby
-puts @flights.arrange(:dep_delay.desc).head.as__data__frame
+puts flights.arrange(:dep_delay.desc).head.as__data__frame
 ```
 ```
@@ -1634,7 +1705,7 @@ To select specific columns from a dataset we use function 'select':
 ```ruby
-puts @flights.select(:year, :month, :day).head.as__data__frame
+puts flights.select(:year, :month, :day).head.as__data__frame
 ```
 ```
@@ -1651,7 +1722,7 @@ It is also possible to select column in a given range
 ```ruby
-puts @flights.select(:year.up_to :day).head.as__data__frame
+puts flights.select(:year.up_to :day).head.as__data__frame
 ```
 ```
@@ -1668,7 +1739,7 @@ Select all columns that start with a given name sequence
 ```ruby
-puts @flights.select(E.starts_with('arr')).head.as__data__frame
+puts flights.select(E.starts_with('arr')).head.as__data__frame
 ```
 ```
@@ -1696,7 +1767,7 @@ A helper function that comes in handy when we just want to rearrange column orde
 ```ruby
-puts @flights.select(:year, :month, :day, E.everything).head.as__data__frame
+puts flights.select(:year, :month, :day, E.everything).head.as__data__frame
 ```
 ```
@@ -1727,13 +1798,13 @@ puts @flights.select(:year, :month, :day, E.everything).head.as__data__frame
 ```ruby
-@flights_sm = @flights.
-                select((:year.up_to :day),
-                       E.ends_with('delay'),
-                       :distance,
-                       :air_time)
+flights_sm = flights.
+               select((:year.up_to :day),
+                      E.ends_with('delay'),
+                      :distance,
+                      :air_time)
-puts @flights_sm.head.as__data__frame
+puts flights_sm.head.as__data__frame
 ```
 ```
@@ -1748,10 +1819,10 @@ puts @flights_sm.head.as__data__frame
 ```ruby
-@flights_sm = @flights_sm.
-                mutate(gain: :dep_delay - :arr_delay,
-                       speed: :distance / :air_time * 60)
-puts @flights_sm.head.as__data__frame
+flights_sm = flights_sm.
+               mutate(gain: :dep_delay - :arr_delay,
+                      speed: :distance / :air_time * 60)
+puts flights_sm.head.as__data__frame
 ```
 ```
@@ -1771,7 +1842,7 @@ a single value is obtained from the data frame:
 ```ruby
-puts @flights.summarise(delay: E.mean(:dep_delay, na__rm: true)).as__data__frame
+puts flights.summarise(delay: E.mean(:dep_delay, na__rm: true)).as__data__frame
 ```
 ```
@@ -1783,7 +1854,7 @@ When a data frame is groupe with 'group_by' summaries apply to the given group:
 ```ruby
-by_day = @flights.group_by(:year, :month, :day)
+by_day = flights.group_by(:year, :month, :day)
 puts by_day.summarise(delay: :dep_delay.mean(na__rm: true)).head.as__data__frame
 ```
@@ -1801,7 +1872,7 @@ Next we put many operations together by pipping them one after the other:
 ```ruby
-delays = @flights.
+delays = flights.
            group_by(:dest).
            summarise(
              count: E.n,
@@ -1809,108 +1880,17 @@ delays = @flights.
              delay: :arr_delay.mean(na__rm: true)).
            filter(:count > 20, :dest != "NHL")
-puts delays.as__data__frame
-```
-```
-##    dest count       dist       delay
-## 1   ABQ   254 1826.00000  4.38188976
-## 2   ACK   265  199.00000  4.85227273
-## 3   ALB   439  143.00000 14.39712919
-## 4   ATL 17215  757.10822 11.30011285
-## 5   AUS  2439 1514.25297  6.01990875
-## 6   AVL   275  583.58182  8.00383142
-## 7   BDL   443  116.00000  7.04854369
-## 8   BGR   375  378.00000  8.02793296
-## 9   BHM   297  865.99663 16.87732342
-## 10  BNA  6333  758.21348 11.81245891
-## 11  BOS 15508  190.63696  2.91439222
-## 12  BQN   896 1578.98326  8.24549550
-## 13  BTV  2589  265.09154  8.95099602
-## 14  BUF  4681  296.80837  8.94595186
-## 15  BUR   371 2465.00000  8.17567568
-## 16  BWI  1781  179.41830 10.72673385
-## 17  BZN    36 1882.00000  7.60000000
-## 18  CAE   116  603.55172 41.76415094
-## 19  CAK   864  397.00000 19.69833729
-## 20  CHO    52  305.00000  9.50000000
-## 21  CHS  2884  632.91678 10.59296847
-## 22  CLE  4573  414.17428  9.18161129
-## 23  CLT 14064  538.02730  7.36031885
-## 24  CMH  3524  476.55505 10.60132291
-## 25  CRW   138  444.00000 14.67164179
-## 26  CVG  3941  575.15986 15.36456376
-## 27  DAY  1525  537.10230 12.68048606
-## 28  DCA  9705  211.00618  9.06695204
-## 29  DEN  7266 1614.67836  8.60650021
-## 30  DFW  8738 1383.04303  0.32212685
-## 31  DSM   569 1020.88752 19.00573614
-## 32  DTW  9384  498.12852  5.42996346
-## 33  EGE   213 1735.70892  6.30434783
-## 34  FLL 12055 1070.06877  8.08212154
-## 35  GRR   765  605.78170 18.18956044
-## 36  GSO  1606  449.84184 14.11260054
-## 37  GSP   849  595.95995 15.93544304
-## 38  HNL   707 4972.67468 -1.36519258
-## 39  HOU  2115 1420.15508  7.17618819
-## 40  IAD  5700  224.84684 13.86420212
-## 41  IAH  7198 1407.20672  4.24079040
-## 42  ILM   110  500.00000  4.63551402
-## 43  IND  2077  652.26288  9.94043412
-## 44  JAC    25 1875.60000 28.09523810
-## 45  JAX  2720  824.67610 11.84483416
-## 46  LAS  5997 2240.96148  0.25772849
-## 47  LAX 16174 2468.62236  0.54711094
-## 48  LGB   668 2465.00000 -0.06202723
-## 49  MCI  2008 1097.69522 14.51405836
-## 50  MCO 14082  943.11057  5.45464309
-## 51  MDW  4113  718.04595 12.36422360
-## 52  MEM  1789  954.20123 10.64531435
-## 53  MHT  1009  207.02973 14.78755365
-## 54  MIA 11728 1091.55244  0.29905978
-## 55  MKE  2802  733.38151 14.16722038
-## 56  MSN   572  803.95455 20.19604317
-## 57  MSP  7185 1017.40167  7.27016886
-## 58  MSY  3799 1177.70571  6.49017497
-## 59  MVY   221  173.00000 -0.28571429
-## 60  MYR    59  550.66102  4.60344828
-## 61  OAK   312 2576.00000  3.07766990
-## 62  OKC   346 1325.00000 30.61904762
-## 63  OMA   849 1135.56655 14.69889841
-## 64  ORD 17283  729.00081  5.87661475
-## 65  ORF  1536  288.52344 10.94909344
-## 66  PBI  6554 1028.83811  8.56297210
-## 67  PDX  1354 2445.56573  5.14157973
-## 68  PHL  1632   94.32353 10.12719014
-## 69  PHX  4656 2141.30326  2.09704733
-## 70  PIT  2875  334.06122  7.68099053
-## 71  PSE   365 1617.00000  7.87150838
-## 72  PVD   376  160.00000 16.23463687
-## 73  PWM  2352  276.12840 11.66040210
-## 74  RDU  8163  426.75769 10.05238095
-## 75  RIC  2454  281.40465 20.11125320
-## 76  ROC  2416  259.25083 11.56064461
-## 77  RSW  3537 1072.85327  3.23814963
-## 78  SAN  2737 2437.29923  3.13916574
-## 79  SAT   686 1578.34111  6.94537178
-## 80  SAV   804  709.18408 15.12950601
-## 81  SDF  1157  645.98358 12.66938406
-## 82  SEA  3923 2412.66531 -1.09909910
-## 83  SFO 13331 2577.92356  2.67289152
-## 84  SJC   329 2569.00000  3.44817073
-## 85  SJU  5819 1599.83365  2.52052659
-## 86  SLC  2467 1986.98662  0.17625459
-## 87  SMF   284 2521.00000 12.10992908
-## 88  SNA   825 2434.00000 -7.86822660
-## 89  SRQ  1211 1044.65153  3.08243131
-## 90  STL  4339  878.72321 11.07846451
-## 91  STT   522 1626.98276 -3.83590734
-## 92  SYR  1761  205.92164  8.90392501
-## 93  TPA  7466 1003.93557  7.40852503
-## 94  TUL   315 1215.00000 33.65986395
-## 95  TVC   101  652.38614 12.96842105
-## 96  TYS   631  638.80983 24.06920415
-## 97  XNA  1036 1142.50579  7.46572581
+puts delays.as__data__frame.head
+```
+```
+##   dest count      dist     delay
+## 1  ABQ   254 1826.0000  4.381890
+## 2  ACK   265  199.0000  4.852273
+## 3  ALB   439  143.0000 14.397129
+## 4  ATL 17215  757.1082 11.300113
+## 5  AUS  2439 1514.2530  6.019909
+## 6  AVL   275  583.5818  8.003831
 ```
 # Using Data Table
@@ -1921,9 +1901,9 @@ R.library('data.table')
 R.install_and_loads('curl')
 input = "https://raw.githubusercontent.com/Rdatatable/data.table/master/vignettes/flights14.csv"
-@flights = R.fread(input)
-puts @flights
-puts @flights.dim
+flights = R.fread(input)
+puts flights
+puts flights.dim
 ```
 ```
@@ -1982,17 +1962,17 @@ puts data_table.ID
 ```ruby
 # subset rows in i
-ans = @flights[(:origin.eq "JFK") & (:month.eq 6)]
+ans = flights[(:origin.eq "JFK") & (:month.eq 6)]
 puts ans.head
 # Get the first two rows from flights.
-ans = @flights[(1..2)]
+ans = flights[(1..2)]
 puts ans
 # Sort flights first by column origin in ascending order, and then by dest in descending order:
-# ans = @flights[E.order(:origin, -(:dest))]
+# ans = flights[E.order(:origin, -(:dest))]
 # puts ans.head
 ```
@@ -2024,15 +2004,15 @@ puts ans
 # Select column(s) in j
 # select arr_delay column, but return it as a vector.
-ans = @flights[:all, :arr_delay]
+ans = flights[:all, :arr_delay]
 puts ans.head
 # Select arr_delay column, but return as a data.table instead.
-ans = @flights[:all, :arr_delay.list]
+ans = flights[:all, :arr_delay.list]
 puts ans.head
-ans = @flights[:all, E.list(:arr_delay, :dep_delay)]
+ans = flights[:all, E.list(:arr_delay, :dep_delay)]
 ```
 ```
@@ -2057,68 +2037,42 @@ the data frame with the necessary data:
 ```ruby
 # copy the R variable :mtcars to the Ruby mtcars variable
-@mtcars = ~:mtcars
+mtcars = ~:mtcars
 # create a new column 'car_name' to store the car names so that it can be
 # used for plotting. The 'rownames' of the data frame cannot be used as
 # data for plotting
-@mtcars.car_name = R.rownames(:mtcars)
+mtcars.car_name = R.rownames(:mtcars)
 # compute normalized mpg and add it to a new column called mpg_z
 # Note that the mean value for mpg can be obtained by calling the 'mean'
 # function on the vector 'mtcars.mpg'.  The same with the standard
 # deviation 'sd'.  The vector is then rounded to two digits with 'round 2'
-@mtcars.mpg_z = ((@mtcars.mpg - @mtcars.mpg.mean)/@mtcars.mpg.sd).round 2
+mtcars.mpg_z = ((mtcars.mpg - mtcars.mpg.mean)/mtcars.mpg.sd).round 2
 # create a new column 'mpg_type'. Function 'ifelse' is a vectorized function
 # that looks at every element of the mpg_z vector and if the value is below
 # 0, returns 'below', otherwise returns 'above'
-@mtcars.mpg_type = (@mtcars.mpg_z < 0).ifelse("below", "above")
+mtcars.mpg_type = (mtcars.mpg_z < 0).ifelse("below", "above")
 # order the mtcars data set by the mpg_z vector from smaller to larger values
-@mtcars = @mtcars[@mtcars.mpg_z.order, :all]
+mtcars = mtcars[mtcars.mpg_z.order, :all]
 # convert the car_name column to a factor to retain sorted order in plot
-@mtcars.car_name = @mtcars.car_name.factor levels: @mtcars.car_name
+mtcars.car_name = mtcars.car_name.factor levels: mtcars.car_name
 # let's look at the final data frame
-puts @mtcars
+puts mtcars.head
 ```
 ```
-##                      mpg cyl  disp  hp drat    wt  qsec vs am gear carb
-## Cadillac Fleetwood  10.4   8 472.0 205 2.93 5.250 17.98  0  0    3    4
-## Lincoln Continental 10.4   8 460.0 215 3.00 5.424 17.82  0  0    3    4
-## Camaro Z28          13.3   8 350.0 245 3.73 3.840 15.41  0  0    3    4
-## Duster 360          14.3   8 360.0 245 3.21 3.570 15.84  0  0    3    4
-## Chrysler Imperial   14.7   8 440.0 230 3.23 5.345 17.42  0  0    3    4
-## Maserati Bora       15.0   8 301.0 335 3.54 3.570 14.60  0  1    5    8
-## Merc 450SLC         15.2   8 275.8 180 3.07 3.780 18.00  0  0    3    3
-## AMC Javelin         15.2   8 304.0 150 3.15 3.435 17.30  0  0    3    2
-## Dodge Challenger    15.5   8 318.0 150 2.76 3.520 16.87  0  0    3    2
-## Ford Pantera L      15.8   8 351.0 264 4.22 3.170 14.50  0  1    5    4
-## Merc 450SE          16.4   8 275.8 180 3.07 4.070 17.40  0  0    3    3
-## Merc 450SL          17.3   8 275.8 180 3.07 3.730 17.60  0  0    3    3
-## Merc 280C           17.8   6 167.6 123 3.92 3.440 18.90  1  0    4    4
-## Valiant             18.1   6 225.0 105 2.76 3.460 20.22  1  0    3    1
-## Hornet Sportabout   18.7   8 360.0 175 3.15 3.440 17.02  0  0    3    2
-## Merc 280            19.2   6 167.6 123 3.92 3.440 18.30  1  0    4    4
-## Pontiac Firebird    19.2   8 400.0 175 3.08 3.845 17.05  0  0    3    2
-## Ferrari Dino        19.7   6 145.0 175 3.62 2.770 15.50  0  1    5    6
-## Mazda RX4           21.0   6 160.0 110 3.90 2.620 16.46  0  1    4    4
-## Mazda RX4 Wag       21.0   6 160.0 110 3.90 2.875 17.02  0  1    4    4
-## Hornet 4 Drive      21.4   6 258.0 110 3.08 3.215 19.44  1  0    3    1
-## Volvo 142E          21.4   4 121.0 109 4.11 2.780 18.60  1  1    4    2
-## Toyota Corona       21.5   4 120.1  97 3.70 2.465 20.01  1  0    3    1
-## Datsun 710          22.8   4 108.0  93 3.85 2.320 18.61  1  1    4    1
-## Merc 230            22.8   4 140.8  95 3.92 3.150 22.90  1  0    4    2
-## Merc 240D           24.4   4 146.7  62 3.69 3.190 20.00  1  0    4    2
-## Porsche 914-2       26.0   4 120.3  91 4.43 2.140 16.70  0  1    5    2
-## Fiat X1-9           27.3   4  79.0  66 4.08 1.935 18.90  1  1    4    1
-## Honda Civic         30.4   4  75.7  52 4.93 1.615 18.52  1  1    4    2
-## Lotus Europa        30.4   4  95.1 113 3.77 1.513 16.90  1  1    5    2
-## Fiat 128            32.4   4  78.7  66 4.08 2.200 19.47  1  1    4    1
-## Toyota Corolla      33.9   4  71.1  65 4.22 1.835 19.90  1  1    4    1
+##                      mpg cyl disp  hp drat    wt  qsec vs am gear carb
+## Cadillac Fleetwood  10.4   8  472 205 2.93 5.250 17.98  0  0    3    4
+## Lincoln Continental 10.4   8  460 215 3.00 5.424 17.82  0  0    3    4
+## Camaro Z28          13.3   8  350 245 3.73 3.840 15.41  0  0    3    4
+## Duster 360          14.3   8  360 245 3.21 3.570 15.84  0  0    3    4
+## Chrysler Imperial   14.7   8  440 230 3.23 5.345 17.42  0  0    3    4
+## Maserati Bora       15.0   8  301 335 3.54 3.570 14.60  0  1    5    8
 ##                                car_name mpg_z mpg_type
 ## Cadillac Fleetwood   Cadillac Fleetwood -1.61    below
 ## Lincoln Continental Lincoln Continental -1.61    below
@@ -2126,32 +2080,6 @@ puts @mtcars
 ## Duster 360                   Duster 360 -0.96    below
 ## Chrysler Imperial     Chrysler Imperial -0.89    below
 ## Maserati Bora             Maserati Bora -0.84    below
-## Merc 450SLC                 Merc 450SLC -0.81    below
-## AMC Javelin                 AMC Javelin -0.81    below
-## Dodge Challenger       Dodge Challenger -0.76    below
-## Ford Pantera L           Ford Pantera L -0.71    below
-## Merc 450SE                   Merc 450SE -0.61    below
-## Merc 450SL                   Merc 450SL -0.46    below
-## Merc 280C                     Merc 280C -0.38    below
-## Valiant                         Valiant -0.33    below
-## Hornet Sportabout     Hornet Sportabout -0.23    below
-## Merc 280                       Merc 280 -0.15    below
-## Pontiac Firebird       Pontiac Firebird -0.15    below
-## Ferrari Dino               Ferrari Dino -0.06    below
-## Mazda RX4                     Mazda RX4  0.15    above
-## Mazda RX4 Wag             Mazda RX4 Wag  0.15    above
-## Hornet 4 Drive           Hornet 4 Drive  0.22    above
-## Volvo 142E                   Volvo 142E  0.22    above
-## Toyota Corona             Toyota Corona  0.23    above
-## Datsun 710                   Datsun 710  0.45    above
-## Merc 230                       Merc 230  0.45    above
-## Merc 240D                     Merc 240D  0.72    above
-## Porsche 914-2             Porsche 914-2  0.98    above
-## Fiat X1-9                     Fiat X1-9  1.20    above
-## Honda Civic                 Honda Civic  1.71    above
-## Lotus Europa               Lotus Europa  1.71    above
-## Fiat 128                       Fiat 128  2.04    above
-## Toyota Corolla           Toyota Corolla  2.29    above
 ```
 Now, let's plot the diverging bar plot.  When using gKnit, there is no need to call
 'R.awt' to create a plotting device, since gKnit does take care of it. Galaaz
@@ -2173,19 +2101,604 @@ but in this graph we want the bars to be horizontally layed so we add 'coord\_fl
 ```ruby
 require 'ggplot'
-puts @mtcars.ggplot(E.aes(x: :car_name, y: :mpg_z, label: :mpg_z)) +
-  R.geom_bar(E.aes(fill: :mpg_type), stat: 'identity', width: 0.5) +
-  R.scale_fill_manual(name: 'Mileage',
-                      labels: R.c('Above Average', 'Below Average'),
-                      values: R.c('above': '#00ba38', 'below': '#f8766d')) +
-  R.labs(subtitle: "Normalised mileage from 'mtcars'",
-         title: "Diverging Bars") +
-  R.coord_flip
+puts mtcars.ggplot(E.aes(x: :car_name, y: :mpg_z, label: :mpg_z)) +
+     R.geom_bar(E.aes(fill: :mpg_type), stat: 'identity', width: 0.5) +
+     R.scale_fill_manual(name: 'Mileage',
+                         labels: R.c('Above Average', 'Below Average'),
+                         values: R.c('above': '#00ba38', 'below': '#f8766d')) +
+     R.labs(subtitle: "Normalised mileage from 'mtcars'",
+            title: "Diverging Bars") +
+     R.coord_flip
 ```
 ![](/home/rbotafogo/desenv/galaaz/blogs/manual/manual_files/figure-html/diverging_bar.png)<!-- -->
+# Coding with Tidyverse
+In R, and when coding with 'tidyverse', arguments to a function are usually not
+*referentially transparent*. That is, you can’t replace a value with a seemingly equivalent
+object that you’ve defined elsewhere. To see the problem, let's first define a data frame:
+```ruby
+df = R.data__frame(x: (1..3), y: (3..1))
+puts df
+```
+```
+##   x y
+## 1 1 3
+## 2 2 2
+## 3 3 1
+```
+and now, let's look at this code:
+```r
+my_var <- x
+filter(df, my_var == 1)
+```
+It generates the following error: "object 'x' not found".
+However, in Galaaz, arguments are referentially transparent, as can be seen in the
+code below.  Note first that 'my_var = :x' does not give the error "object 'x' not found",
+since ':x' is treated as an expression and assigned to my\_var. Then, in (my\_var.eq 1),
+my\_var is a variable that resolves to ':x', so the call becomes equivalent to (:x.eq 1), which is
+what we want.
+```ruby
+my_var = :x
+puts df.filter(my_var.eq 1)
+```
+```
+##   x y
+## 1 1 3
+```
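The same idea can be sketched in plain Ruby, without Galaaz. In this minimal sketch the data frame is modelled as an array of hashes, which is an assumption made only for illustration; `rows` and `filter_eq` are hypothetical names and not part of the Galaaz API. Because a symbol is an ordinary value, passing a variable that holds the symbol gives the same result as passing the symbol itself:

```ruby
# Plain-Ruby stand-in for the data frame above: one hash per row.
rows = [
  { x: 1, y: 3 },
  { x: 2, y: 2 },
  { x: 3, y: 1 }
]

# Hypothetical helper: keep the rows whose column equals the given value.
def filter_eq(rows, column, value)
  rows.select { |row| row[column] == value }
end

my_var = :x                            # the symbol is stored, not evaluated
direct   = filter_eq(rows, :x, 1)      # column named directly
indirect = filter_eq(rows, my_var, 1)  # column named through a variable

puts indirect.length
puts direct == indirect
```

Both calls select the single row in which column 'x' is 1, so the final comparison holds.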
+As stated by Hadley:
+> dplyr code is ambiguous. Depending on what variables are defined where,
+> filter(df, x == y) could be equivalent to any of:
+```
+df[df$x == df$y, ]
+df[df$x == y, ]
+df[x == df$y, ]
+df[x == y, ]
+```
+In Galaaz this ambiguity does not exist: filter(df, x.eq y) is not a valid expression, as
+expressions are built with symbols.  In filter(df, :x.eq y) we are looking for elements
+of the 'x' column that are equal to a previously defined y variable.  Finally, in
+filter(df, :x.eq :y) we are looking for elements in which the 'x' column value is equal to
+the 'y' column value. This can be seen in the following two chunks of code:
+```ruby
+y = 1
+x = 2
+# looking for values where the 'x' column is equal to the 'y' column
+puts df.filter(:x.eq :y)
+```
+```
+##   x y
+## 1 2 2
+```
+```ruby
+# looking for values where the 'x' column is equal to the 'y' variable
+# in this case, the number 1
+puts df.filter(:x.eq y)
+```
+```
+##   x y
+## 1 1 3
+```
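One way to picture how this ambiguity disappears is the plain-Ruby sketch below. It is not how Galaaz is implemented; `DF_ROWS`, `operand_value` and `filter_equal` are hypothetical names, and the rule "a Symbol names a column, anything else is a literal value" is an assumption chosen to mirror the two chunks above:

```ruby
# Plain-Ruby stand-in for the data frame: one hash per row.
DF_ROWS = [
  { x: 1, y: 3 },
  { x: 2, y: 2 },
  { x: 3, y: 1 }
].freeze

# Hypothetical rule: a Symbol names a column, anything else is a literal value.
def operand_value(row, operand)
  operand.is_a?(Symbol) ? row.fetch(operand) : operand
end

# Hypothetical helper: keep rows where both operands resolve to equal values.
def filter_equal(rows, lhs, rhs)
  rows.select { |row| operand_value(row, lhs) == operand_value(row, rhs) }
end

y = 1
p filter_equal(DF_ROWS, :x, :y)   # column 'x' against column 'y'
p filter_equal(DF_ROWS, :x, y)    # column 'x' against the Ruby variable y
```

The first call keeps the row where both columns are 2; the second keeps the row where column 'x' is 1, matching the Galaaz output shown above.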
+## Writing a function that applies to different data sets
+Let's suppose that we want to write a function that receives a data frame as its argument
+and adds to it a column 'y' that is equal to the sum of the elements in column 'a' and
+column 'x'.
+Here is the intended behaviour using the 'mutate' function of 'dplyr':
+```
+mutate(df1, y = a + x)
+mutate(df2, y = a + x)
+mutate(df3, y = a + x)
+mutate(df4, y = a + x)
+```
+The naive approach to writing an R function to solve this problem is:
+```
+mutate_y <- function(df) {
+  mutate(df, y = a + x)
+}
+```
+Unfortunately, in R, this function can fail silently if one of the variables isn’t present
+in the data frame, but is present in the global environment.  We will not cover here how
+to solve this problem in R.
+In Galaaz the method mutate_y below will work fine and will never fail silently.
+```ruby
+def mutate_y(df)
+  df.mutate(:y.assign :a + :x)
+end
+```
+Here we create a data frame that has only one column named 'x':
+```ruby
+df1 = R.data__frame(x: (1..3))
+puts df1
+```
+```
+##   x
+## 1 1
+## 2 2
+## 3 3
+```
+Note that method mutate_y fails regardless of whether variable 'a' is defined and
+in the scope of the method.  Variable 'a' has no relationship with the symbol ':a' used in the
+definition of 'mutate\_y' above:
+```ruby
+a = 10
+mutate_y(df1)
+```
+```
+## Message:
+##  Error in mutate_impl(.data, dots) :
+##   Evaluation error: object 'a' not found.
+## In addition: Warning message:
+## In mutate_impl(.data, dots) :
+##   mismatched protect/unprotect (unprotect with empty protect stack) (RError)
+## Translated to internal error
+```
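The separation between a local variable and a symbol of the same name is ordinary Ruby behaviour and can be sketched without Galaaz. In the sketch below, `mutate_y_rows` is a hypothetical stand-in that works on an array of hashes (an assumption for illustration only); `Hash#fetch` raises a `KeyError` for a missing column, so the failure is loud rather than silent:

```ruby
# Hypothetical stand-in for mutate_y: adds y = a + x to every row.
# Hash#fetch raises KeyError when a column is missing.
def mutate_y_rows(rows)
  rows.map { |row| row.merge(y: row.fetch(:a) + row.fetch(:x)) }
end

a = 10                                   # local variable, unrelated to :a
rows_without_a = [{ x: 1 }, { x: 2 }, { x: 3 }]

begin
  mutate_y_rows(rows_without_a)
rescue KeyError => e
  puts "failed as expected: #{e.message}"
end

# With a real 'a' column the same method succeeds.
p mutate_y_rows([{ a: 10, x: 1 }])
```

The local variable `a` is never consulted: the method only sees the symbol `:a`, which it looks up in each row.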
+## Different expressions
+Let's move to the next problem presented by Hadley: writing a function in R
+that receives two arguments, the first a variable and the second an expression, is not trivial.
+Below we create a data frame, and we want to write a function that groups data by a variable and
+summarises it by an expression:
+```r
+set.seed(123)
+df <- data.frame(
+  g1 = c(1, 1, 2, 2, 2),
+  g2 = c(1, 2, 1, 2, 1),
+  a = sample(5),
+  b = sample(5)
+)
+as.data.frame(df)
+```
+```
+##   g1 g2 a b
+## 1  1  1 2 1
+## 2  1  2 4 3
+## 3  2  1 5 4
+## 4  2  2 3 2
+## 5  2  1 1 5
+```
+```r
+d2 <- df %>%
+  group_by(g1) %>%
+  summarise(a = mean(a))
+as.data.frame(d2)
+```
+```
+##   g1 a
+## 1  1 3
+## 2  2 3
+```
+```r
+d2 <- df %>%
+  group_by(g2) %>%
+  summarise(a = mean(a))
+as.data.frame(d2)
+```
+```
+##   g2        a
+## 1  1 2.666667
+## 2  2 3.500000
+```
+As shown by Hadley, one might expect this function to do the trick:
+```r
+my_summarise <- function(df, group_var) {
+  df %>%
+    group_by(group_var) %>%
+    summarise(a = mean(a))
+}
+# my_summarise(df, g1)
+#> Error: Column `group_var` is unknown
+```
+In order to solve this problem, coding with dplyr requires the introduction of many new concepts
+and functions such as 'quo', 'quos', 'enquo', 'enquos', '!!' (bang bang), '!!!' (triple bang).
+Again, we'll leave to Hadley the explanation of how to use all those functions.
+Now, let's try to implement the same function in Galaaz.  The next code block first prints the
+'df' data frame defined previously in R (to access an R variable from Galaaz, we use the tilde
+operator '~' applied to the R variable name as a symbol, i.e., ':df').
+```ruby
+puts ~:df
+```
+```
+##   g1 g2 a b
+## 1  1  1 2 1
+## 2  1  2 4 3
+## 3  2  1 5 4
+## 4  2  2 3 2
+## 5  2  1 1 5
+```
+We then create the 'my_summarize' method and call it passing the R data frame and
+the group by variable ':g1':
+```ruby
+def my_summarize(df, group_var)
+  df.group_by(group_var).
+    summarize(a: :a.mean)
+end
+puts my_summarize(:df, :g1).as__data__frame
+```
+```
+##   g1 a
+## 1  1 3
+## 2  2 3
+```
+It works!!! Well, let's make sure this was not just some coincidence:
+```ruby
+puts my_summarize(:df, :g2).as__data__frame
+```
+```
+##   g2        a
+## 1  1 2.666667
+## 2  2 3.500000
+```
+Great, everything is fine! No magic, no new functions, no complexities, just normal, standard Ruby
+code.  If you've ever done NSE in R, this certainly feels much safer and easier to implement.
+## Different input variables
+In the previous section we've managed to get rid of all NSE formulation for a simple example, but
+does this remain true for more complex examples, or will the Galaaz way prove impractical for
+more complex code?
+In the next example Hadley asks us to write a function that, given an expression such as 'a'
+or 'a * b', calculates three summaries.  We want a function that does the same as these R
+statements:
+```
+summarise(df, mean = mean(a), sum = sum(a), n = n())
+#> # A tibble: 1 x 3
+#>    mean   sum     n
+#>   <dbl> <int> <int>
+#> 1     3    15     5
+summarise(df, mean = mean(a * b), sum = sum(a * b), n = n())
+#> # A tibble: 1 x 3
+#>    mean   sum     n
+#>   <dbl> <int> <int>
+#> 1   9    45     5
+```
+Let's try it in galaaz:
+```ruby
+def my_summarise2(df, expr)
+  df.summarize(
+    mean: E.mean(expr),
+    sum: E.sum(expr),
+    n: E.n
+  )
+end
+puts my_summarise2((~:df), :a)
+puts "\n"
+puts my_summarise2((~:df), :a * :b)
+```
+```
+##   mean sum n
+## 1    3  15 5
+##
+##   mean sum n
+## 1    9  45 5
+```
+Once again, there is no need to use any special theory or functions.  The only point to be
+careful about is the use of 'E' to build expressions from functions 'mean', 'sum' and 'n'.
+## Different input and output variable
+The next challenge presented by Hadley is to vary the name of the output variables based on
+the received expression.  So, if the input expression is 'a', we want our data frame columns to
+be named 'mean\_a' and 'sum\_a'.  If the input expression is 'b', columns
+should be named 'mean\_b' and 'sum\_b'.
+```
+mutate(df, mean_a = mean(a), sum_a = sum(a))
+#> # A tibble: 5 x 6
+#>      g1    g2     a     b mean_a sum_a
+#>   <dbl> <dbl> <int> <int>  <dbl> <int>
+#> 1     1     1     1     3      3    15
+#> 2     1     2     4     2      3    15
+#> 3     2     1     2     1      3    15
+#> 4     2     2     5     4      3    15
+#> # … with 1 more row
+mutate(df, mean_b = mean(b), sum_b = sum(b))
+#> # A tibble: 5 x 6
+#>      g1    g2     a     b mean_b sum_b
+#>   <dbl> <dbl> <int> <int>  <dbl> <int>
+#> 1     1     1     1     3      3    15
+#> 2     1     2     4     2      3    15
+#> 3     2     1     2     1      3    15
+#> 4     2     2     5     4      3    15
+#> # … with 1 more row
+```
+In order to solve this problem in R, Hadley needs to introduce some more new functions and notations:
+'quo_name' and the ':=' operator from package 'rlang'.
+Here is our Ruby code:
+```ruby
+def my_mutate(df, expr)
+  mean_name = "mean_#{expr.to_s}"
+  sum_name = "sum_#{expr.to_s}"
+  df.mutate(mean_name => E.mean(expr),
+            sum_name => E.sum(expr))
+end
+puts my_mutate((~:df), :a)
+puts "\n"
+puts my_mutate((~:df), :b)
+```
+```
+##   g1 g2 a b mean_a sum_a
+## 1  1  1 2 1      3    15
+## 2  1  2 4 3      3    15
+## 3  2  1 5 4      3    15
+## 4  2  2 3 2      3    15
+## 5  2  1 1 5      3    15
+##
+##   g1 g2 a b mean_b sum_b
+## 1  1  1 2 1      3    15
+## 2  1  2 4 3      3    15
+## 3  2  1 5 4      3    15
+## 4  2  2 3 2      3    15
+## 5  2  1 1 5      3    15
+```
+It really seems that "Non Standard Evaluation" is actually quite standard in Galaaz! But you
+might have noticed a small change in the way the arguments to the mutate method were passed.
+In a previous example we used df.summarise(mean: E.mean(:a), ...) where the column name was
+followed by a colon ':'.  In this example, we have df.mutate(mean_name => E.mean(expr), ...)
+and variable mean\_name is not followed by ':' but by '=>'.  This is standard Ruby notation:
+'mean: value' always uses the literal symbol ':mean' as the key, while 'mean\_name => value'
+uses the current value of the variable mean\_name as the key.
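The difference between the two key notations can be checked in plain Ruby, independently of Galaaz. In this minimal sketch, `summary_columns` is a hypothetical helper that builds column names at run time with string interpolation, in the same spirit as 'my\_mutate' above; the numbers are the values of column 'a' from the 'df' data frame:

```ruby
# Hypothetical helper: column names are computed from the expression,
# so the hash must be written with '=>' rather than with 'name:'.
def summary_columns(expr, values)
  mean_name = "mean_#{expr}"
  sum_name  = "sum_#{expr}"
  total = values.sum
  { mean_name => total.to_f / values.size, sum_name => total }
end

# 'mean_name:' is a literal key: the symbol :mean_name, whatever the
# variable of the same name may contain.
p({ mean_name: 3 }.keys)

# '=>' uses the value of the variable, so the keys follow the expression.
p summary_columns(:a, [2, 4, 5, 3, 1]).keys
p summary_columns(:b, [1, 3, 4, 2, 5]).keys
```

The first hash has the single key `:mean_name`, while the other two have the string keys built from `:a` and `:b`.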
+## Capturing multiple variables
+Moving on to new complexities, Hadley asks us to solve the problem in which the
+summarise function receives any number of grouping variables.
+This again is quite standard Ruby.  In order to receive an undefined number of parameters,
+the parameter name is preceded by '*':
+```ruby
+def my_summarise3(df, *group_vars)
+  df.group_by(*group_vars).
+    summarise(a: E.mean(:a))
+end
+puts my_summarise3((~:df), :g1, :g2).as__data__frame
+```
+```
+##   g1 g2 a
+## 1  1  1 2
+## 2  1  2 4
+## 3  2  1 3
+## 4  2  2 3
+```
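The splat operator itself is plain Ruby, so its two roles (collecting arguments in the method signature and spreading them again in a call) can be sketched without Galaaz. Here `SPLAT_ROWS` copies columns 'g1', 'g2' and 'a' of the 'df' data frame into an array of hashes, and `mean_a_by` is a hypothetical helper, not a Galaaz method:

```ruby
# Plain-Ruby copy of the 'df' data used above (columns g1, g2 and a only).
SPLAT_ROWS = [
  { g1: 1, g2: 1, a: 2 },
  { g1: 1, g2: 2, a: 4 },
  { g1: 2, g2: 1, a: 5 },
  { g1: 2, g2: 2, a: 3 },
  { g1: 2, g2: 1, a: 1 }
].freeze

# Hypothetical helper: '*group_vars' collects any number of column names
# into an array; '*' then spreads that array again for Hash#values_at.
def mean_a_by(rows, *group_vars)
  rows.group_by { |row| row.values_at(*group_vars) }
      .transform_values { |group| group.sum { |row| row[:a] }.to_f / group.size }
end

p mean_a_by(SPLAT_ROWS, :g1)
p mean_a_by(SPLAT_ROWS, :g1, :g2)
```

Grouping by 'g1' and 'g2' gives the means 2, 4, 3 and 3 for the four groups, which agrees with the Galaaz output above.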
+## Why does R require NSE and Galaaz does not?
+NSE introduces a number of new concepts, such as 'quoting', 'quasiquotation', 'unquoting' and
+'unquote-splicing', while in Galaaz none of those concepts are needed. What gives?
+R is an extremely flexible language and it has lazy evaluation of parameters. When in R a
+function is called as 'summarise(df, a = b)', the summarise function receives the literal
+'a = b' parameter and can work with this as if it were a string. In R, it is not clear what
+a and b are: they can be expressions or they can be variables, and it is up to the function to
+decide what 'a = b' means.
+In Ruby, there is no lazy evaluation of parameters and 'a' is always a variable and so is 'b'.
+Variables assume their value as soon as they are used, so 'x = a' is immediately evaluated and
+variable 'x' will receive the value of variable 'a' as soon as the Ruby statement is executed.
+Ruby also provides the notion of a symbol; ':a' is a symbol that evaluates only to itself.
+Galaaz uses Ruby symbols to build expressions that are not bound to anything: ':a.eq :b' is
+clearly an expression and has no relationship whatsoever with the statement 'a = b'. By using
+symbols, variables and expressions, all the possible ambiguities that are found in R are
+eliminated in Galaaz.
+The main problem that remains is that R functions do not clearly document what type
+of input they expect. They might expect regular variables or they might
+expect expressions, and the R function will know how to deal with an input of the form
+'a = b'. For the Ruby developer, however, it might not be immediately clear whether to call the
+function passing the value 'true' if variable 'a' is equal to variable 'b', or to
+call the function passing the expression ':a.eq :b'.
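The contrast between an eagerly evaluated comparison and an expression built from symbols can be sketched in plain Ruby. The `Comparison` struct below is a hypothetical stand-in for an expression object; it is not the class Galaaz uses, and rows are again modelled as hashes only for illustration:

```ruby
# Hypothetical expression node: it records the comparison without evaluating it.
Comparison = Struct.new(:lhs, :rhs) do
  # Evaluate against a row, treating both symbols as column names.
  def call(row)
    row.fetch(lhs) == row.fetch(rhs)
  end
end

a = 1
b = 2

eager    = (a == b)                 # evaluated immediately: a plain boolean
deferred = Comparison.new(:a, :b)   # nothing is evaluated yet

p eager
p deferred.call({ a: 5, b: 5 })     # evaluated later, against some data
```

The eager comparison has already collapsed to `false` before any function could see it, while the deferred one still knows it is about columns 'a' and 'b' and yields `true` for a row in which they match.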
+## Advanced dplyr features
+In the blog post "Programming with dplyr by using dplyr" (https://www.r-bloggers.com/programming-with-dplyr-by-using-dplyr/), Iñaki Úcar shows surprise that some R users are trying to code in dplyr while avoiding
+the use of NSE.  For instance, he says:
+> Take the example of seplyr. It stands for standard evaluation dplyr, and enables us to
+> program over dplyr without having “to bring in (or study) any deep-theory or
+> heavy-weight tools such as rlang/tidyeval”.
+For me, there isn't really any surprise that users are trying to avoid dplyr deep-theory. R
+users frequently are not programmers and learning to code is already hard enough; on top
+of that, having to learn how to 'quote' or 'enquo' or 'quos' or 'enquos' is not necessarily
+a 'piece of cake'. So much so, that 'tidyeval' has some more advanced functions that, instead
+of using quoted expressions, use strings as arguments.
+In the following examples, we show the use of functions 'group\_by\_at', 'summarise\_at' and
+'rename\_at' that receive strings as arguments. The data frame used is 'starwars', which describes
+features of characters in the Star Wars movies:
+```ruby
+puts (~:starwars).head.as__data__frame
+```
+```
+##             name height mass  hair_color  skin_color eye_color birth_year
+## 1 Luke Skywalker    172   77       blond        fair      blue       19.0
+## 2          C-3PO    167   75        <NA>        gold    yellow      112.0
+## 3          R2-D2     96   32        <NA> white, blue       red       33.0
+## 4    Darth Vader    202  136        none       white    yellow       41.9
+## 5    Leia Organa    150   49       brown       light     brown       19.0
+## 6      Owen Lars    178  120 brown, grey       light      blue       52.0
+##   gender homeworld species
+## 1   male  Tatooine   Human
+## 2   <NA>  Tatooine   Droid
+## 3   <NA>     Naboo   Droid
+## 4   male  Tatooine   Human
+## 5 female  Alderaan   Human
+## 6   male  Tatooine   Human
+##                                                                                                                                       films
+## 1                                           Revenge of the Sith, Return of the Jedi, The Empire Strikes Back, A New Hope, The Force Awakens
+## 2                    Attack of the Clones, The Phantom Menace, Revenge of the Sith, Return of the Jedi, The Empire Strikes Back, A New Hope
+## 3 Attack of the Clones, The Phantom Menace, Revenge of the Sith, Return of the Jedi, The Empire Strikes Back, A New Hope, The Force Awakens
+## 4                                                              Revenge of the Sith, Return of the Jedi, The Empire Strikes Back, A New Hope
+## 5                                           Revenge of the Sith, Return of the Jedi, The Empire Strikes Back, A New Hope, The Force Awakens
+## 6                                                                                     Attack of the Clones, Revenge of the Sith, A New Hope
+##                             vehicles                starships
+## 1 Snowspeeder, Imperial Speeder Bike X-wing, Imperial shuttle
+## 2
+## 3
+## 4                                             TIE Advanced x1
+## 5              Imperial Speeder Bike
+## 6
+```
+The grouped_mean function below receives grouping variables and calculates summaries for
+the value\_variables given:
+```r
+grouped_mean <- function(data, grouping_variables, value_variables) {
+  data %>%
+    group_by_at(grouping_variables) %>%
+    mutate(count = n()) %>%
+    summarise_at(c(value_variables, "count"), mean, na.rm = TRUE) %>%
+    rename_at(value_variables, funs(paste0("mean_", .)))
+    }
+gm = starwars %>%
+   grouped_mean("eye_color", c("mass", "birth_year"))
+as.data.frame(gm)
+```
+```
+##        eye_color mean_mass mean_birth_year count
+## 1          black  76.28571        33.00000    10
+## 2           blue  86.51667        67.06923    19
+## 3      blue-gray  77.00000        57.00000     1
+## 4          brown  66.09231       108.96429    21
+## 5           dark       NaN             NaN     1
+## 6           gold       NaN             NaN     1
+## 7  green, yellow 159.00000             NaN     1
+## 8          hazel  66.00000        34.50000     3
+## 9         orange 282.33333       231.00000     8
+## 10          pink       NaN             NaN     1
+## 11           red  81.40000        33.66667     5
+## 12     red, blue       NaN             NaN     1
+## 13       unknown  31.50000             NaN     3
+## 14         white  48.00000             NaN     1
+## 15        yellow  81.11111        76.38000    11
+```
+The same code with Galaaz becomes:
+```ruby
+def grouped_mean(data, grouping_variables, value_variables)
+  data.
+    group_by_at(grouping_variables).
+    mutate(count: E.n).
+    summarise_at(E.c(value_variables, "count"), ~:mean, na__rm: true).
+    rename_at(value_variables, E.funs(E.paste0("mean_", value_variables)))
+end
+puts grouped_mean((~:starwars), "eye_color", E.c("mass", "birth_year")).as__data__frame
+```
+```
+##        eye_color mean_mass mean_birth_year count
+## 1          black  76.28571        33.00000    10
+## 2           blue  86.51667        67.06923    19
+## 3      blue-gray  77.00000        57.00000     1
+## 4          brown  66.09231       108.96429    21
+## 5           dark       NaN             NaN     1
+## 6           gold       NaN             NaN     1
+## 7  green, yellow 159.00000             NaN     1
+## 8          hazel  66.00000        34.50000     3
+## 9         orange 282.33333       231.00000     8
+## 10          pink       NaN             NaN     1
+## 11           red  81.40000        33.66667     5
+## 12     red, blue       NaN             NaN     1
+## 13       unknown  31.50000             NaN     3
+## 14         white  48.00000             NaN     1
+## 15        yellow  81.11111        76.38000    11
+```
 [TO BE CONTINUED...]