RubyGems - galaaz - Versions diffs - 0.4.6 → 0.5.0 - Mend

galaaz 0.4.6 → 0.5.0

Files changed (181)

checksums.yaml +5 -5
data/README.md +3575 -118
data/Rakefile +21 -4
data/bin/gknit +152 -6
data/bin/gknit-draft +105 -0
data/bin/gknit-draft.rb +28 -0
data/bin/gknit_Rscript +127 -0
data/bin/grun +27 -1
data/bin/gstudio +47 -4
data/bin/{gstudio.rb → gstudio_irb.rb} +0 -0
data/bin/gstudio_pry.rb +7 -0
data/blogs/galaaz_ggplot/galaaz_ggplot.Rmd +3 -12
data/blogs/galaaz_ggplot/galaaz_ggplot.html +77 -222
data/blogs/galaaz_ggplot/galaaz_ggplot.md +4 -31
data/blogs/galaaz_ggplot/galaaz_ggplot.pdf +0 -0
data/blogs/galaaz_ggplot/galaaz_ggplot_files/figure-html/midwest_rb.png +0 -0
data/blogs/galaaz_ggplot/galaaz_ggplot_files/figure-html/scatter_plot_rb.png +0 -0
data/blogs/galaaz_ggplot/midwest.Rmd +1 -9
data/blogs/gknit/gknit.Rmd +232 -123
data/blogs/{dev/dev.html → gknit/gknit.html} +1897 -33
data/blogs/gknit/gknit.pdf +0 -0
data/blogs/gknit/lst.rds +0 -0
data/blogs/gknit/stats.bib +27 -0
data/blogs/manual/lst.rds +0 -0
data/blogs/manual/manual.Rmd +1893 -47
data/blogs/manual/manual.html +3153 -347
data/blogs/manual/manual.md +3575 -118
data/blogs/manual/manual.pdf +0 -0
data/blogs/manual/manual.tex +4026 -0
data/blogs/manual/manual_files/figure-html/bubble-1.png +0 -0
data/blogs/manual/manual_files/figure-html/diverging_bar.png +0 -0
data/blogs/manual/manual_files/figure-latex/bubble-1.png +0 -0
data/blogs/manual/manual_files/figure-latex/diverging_bar.pdf +0 -0
data/blogs/{dev → manual}/model.rb +0 -0
data/blogs/nse_dplyr/nse_dplyr.Rmd +849 -0
data/blogs/nse_dplyr/nse_dplyr.html +878 -0
data/blogs/nse_dplyr/nse_dplyr.md +1198 -0
data/blogs/nse_dplyr/nse_dplyr.pdf +0 -0
data/blogs/oh_my/oh_my.html +274 -386
data/blogs/oh_my/oh_my.md +208 -205
data/blogs/ruby_plot/ruby_plot.Rmd +64 -84
data/blogs/ruby_plot/ruby_plot.html +235 -208
data/blogs/ruby_plot/ruby_plot.md +239 -34
data/blogs/ruby_plot/ruby_plot.pdf +0 -0
data/blogs/ruby_plot/ruby_plot_files/figure-html/dose_len.png +0 -0
data/blogs/ruby_plot/ruby_plot_files/figure-html/facet_by_delivery.png +0 -0
data/blogs/ruby_plot/ruby_plot_files/figure-html/facet_by_dose.png +0 -0
data/blogs/ruby_plot/ruby_plot_files/figure-html/facets_by_delivery_color.png +0 -0
data/blogs/ruby_plot/ruby_plot_files/figure-html/facets_by_delivery_color2.png +0 -0
data/blogs/ruby_plot/ruby_plot_files/figure-html/facets_with_decorations.png +0 -0
data/blogs/ruby_plot/ruby_plot_files/figure-html/facets_with_jitter.png +0 -0
data/blogs/ruby_plot/ruby_plot_files/figure-html/facets_with_points.png +0 -0
data/blogs/ruby_plot/ruby_plot_files/figure-html/final_box_plot.png +0 -0
data/blogs/ruby_plot/ruby_plot_files/figure-html/final_violin_plot.png +0 -0
data/blogs/ruby_plot/ruby_plot_files/figure-html/violin_with_jitter.png +0 -0
data/examples/Bibliography/master.bib +50 -0
data/examples/Bibliography/stats.bib +72 -0
data/examples/islr/ch2.spec.rb +1 -1
data/examples/islr/ch3_boston.rb +4 -4
data/examples/islr/x_y_rnorm.jpg +0 -0
data/examples/latex_templates/Test-acm_article/Makefile +16 -0
data/examples/latex_templates/Test-acm_article/Test-acm_article.Rmd +65 -0
data/examples/latex_templates/Test-acm_article/acm_proc_article-sp.cls +1670 -0
data/examples/latex_templates/Test-acm_article/sensys-abstract.cls +703 -0
data/examples/latex_templates/Test-acm_article/sigproc.bib +59 -0
data/examples/latex_templates/Test-acs_article/Test-acs_article.Rmd +260 -0
data/examples/latex_templates/Test-acs_article/Test-acs_article.pdf +0 -0
data/examples/latex_templates/Test-acs_article/acs-Test-acs_article.bib +11 -0
data/examples/latex_templates/Test-acs_article/acs-my_output.bib +11 -0
data/examples/latex_templates/Test-acs_article/acstest.bib +17 -0
data/examples/latex_templates/Test-aea_article/AEA.cls +1414 -0
data/examples/latex_templates/Test-aea_article/BibFile.bib +0 -0
data/examples/latex_templates/Test-aea_article/Test-aea_article.Rmd +108 -0
data/examples/latex_templates/Test-aea_article/Test-aea_article.pdf +0 -0
data/examples/latex_templates/Test-aea_article/aea.bst +1269 -0
data/examples/latex_templates/Test-aea_article/multicol.sty +853 -0
data/examples/latex_templates/Test-aea_article/references.bib +0 -0
data/examples/latex_templates/Test-aea_article/setspace.sty +546 -0
data/examples/latex_templates/Test-amq_article/Test-amq_article.Rmd +256 -0
data/examples/latex_templates/Test-amq_article/Test-amq_article.pdf +0 -0
data/examples/latex_templates/Test-amq_article/Test-amq_article.pdfsync +3397 -0
data/examples/latex_templates/Test-amq_article/pics/Figure2.pdf +0 -0
data/examples/latex_templates/Test-ams_article/Test-ams_article.Rmd +215 -0
data/examples/latex_templates/Test-ams_article/amstest.bib +436 -0
data/examples/latex_templates/Test-asa_article/Test-asa_article.Rmd +153 -0
data/examples/latex_templates/Test-asa_article/Test-asa_article.pdf +0 -0
data/examples/latex_templates/Test-asa_article/agsm.bst +1353 -0
data/examples/latex_templates/Test-asa_article/bibliography.bib +233 -0
data/examples/latex_templates/Test-ieee_article/IEEEtran.bst +2409 -0
data/examples/latex_templates/Test-ieee_article/IEEEtran.cls +6346 -0
data/examples/latex_templates/Test-ieee_article/Test-ieee_article.Rmd +175 -0
data/examples/latex_templates/Test-ieee_article/Test-ieee_article.pdf +0 -0
data/examples/latex_templates/Test-ieee_article/mybibfile.bib +20 -0
data/examples/latex_templates/Test-rjournal_article/RJournal.sty +335 -0
data/examples/latex_templates/Test-rjournal_article/RJreferences.bib +18 -0
data/examples/latex_templates/Test-rjournal_article/RJwrapper.pdf +0 -0
data/examples/latex_templates/Test-rjournal_article/Test-rjournal_article.Rmd +52 -0
data/examples/latex_templates/Test-springer_article/Test-springer_article.Rmd +65 -0
data/examples/latex_templates/Test-springer_article/Test-springer_article.pdf +0 -0
data/examples/latex_templates/Test-springer_article/bibliography.bib +26 -0
data/examples/latex_templates/Test-springer_article/spbasic.bst +1658 -0
data/examples/latex_templates/Test-springer_article/spmpsci.bst +1512 -0
data/examples/latex_templates/Test-springer_article/spphys.bst +1443 -0
data/examples/latex_templates/Test-springer_article/svglov3.clo +113 -0
data/examples/latex_templates/Test-springer_article/svjour3.cls +1431 -0
data/examples/misc/moneyball.rb +1 -1
data/examples/misc/subsetting.rb +37 -37
data/examples/rmarkdown/svm-rmarkdown-anon-ms-example/svm-rmarkdown-anon-ms-example.Rmd +73 -0
data/examples/rmarkdown/svm-rmarkdown-anon-ms-example/svm-rmarkdown-anon-ms-example.pdf +0 -0
data/examples/rmarkdown/svm-rmarkdown-article-example/svm-rmarkdown-article-example.Rmd +382 -0
data/examples/rmarkdown/svm-rmarkdown-article-example/svm-rmarkdown-article-example.pdf +0 -0
data/examples/rmarkdown/svm-rmarkdown-beamer-example/svm-rmarkdown-beamer-example.Rmd +164 -0
data/examples/rmarkdown/svm-rmarkdown-beamer-example/svm-rmarkdown-beamer-example.pdf +0 -0
data/examples/rmarkdown/svm-rmarkdown-cv/svm-rmarkdown-cv.Rmd +92 -0
data/examples/rmarkdown/svm-rmarkdown-cv/svm-rmarkdown-cv.pdf +0 -0
data/examples/rmarkdown/svm-rmarkdown-syllabus-example/attend-grade-relationships.csv +482 -0
data/examples/rmarkdown/svm-rmarkdown-syllabus-example/svm-rmarkdown-syllabus-example.Rmd +280 -0
data/examples/rmarkdown/svm-rmarkdown-syllabus-example/svm-rmarkdown-syllabus-example.pdf +0 -0
data/examples/rmarkdown/svm-xaringan-example/svm-xaringan-example.Rmd +386 -0
data/lib/R_interface/r.rb +2 -2
data/lib/R_interface/r_libs.R +6 -1
data/lib/R_interface/r_methods.rb +12 -2
data/lib/R_interface/rdata_frame.rb +8 -17
data/lib/R_interface/rindexed_object.rb +1 -2
data/lib/R_interface/rlist.rb +1 -0
data/lib/R_interface/robject.rb +20 -23
data/lib/R_interface/rpkg.rb +15 -6
data/lib/R_interface/rsupport.rb +13 -19
data/lib/R_interface/ruby_extensions.rb +14 -18
data/lib/R_interface/rvector.rb +0 -12
data/lib/gknit.rb +2 -0
data/lib/gknit/draft.rb +105 -0
data/lib/gknit/knitr_engine.rb +6 -37
data/lib/util/exec_ruby.rb +22 -84
data/lib/util/inline_file.rb +7 -3
data/specs/figures/bg.jpeg +0 -0
data/specs/figures/bg.png +0 -0
data/specs/figures/bg.svg +2 -2
data/specs/figures/dose_len.png +0 -0
data/specs/figures/no_args.jpeg +0 -0
data/specs/figures/no_args.png +0 -0
data/specs/figures/no_args.svg +2 -2
data/specs/figures/width_height.jpeg +0 -0
data/specs/figures/width_height.png +0 -0
data/specs/figures/width_height_units1.jpeg +0 -0
data/specs/figures/width_height_units1.png +0 -0
data/specs/figures/width_height_units2.jpeg +0 -0
data/specs/figures/width_height_units2.png +0 -0
data/specs/r_dataframe.spec.rb +184 -11
data/specs/r_list.spec.rb +4 -4
data/specs/r_list_apply.spec.rb +11 -10
data/specs/ruby_expression.spec.rb +3 -11
data/specs/tmp.rb +106 -34
data/version.rb +1 -1
metadata +96 -33
data/bin/gknit_old_r +0 -236
data/blogs/dev/dev.Rmd +0 -77
data/blogs/dev/dev.md +0 -87
data/blogs/dev/dev_files/figure-html/bubble-1.png +0 -0
data/blogs/dev/dev_files/figure-html/diverging_bar. +0 -0
data/blogs/dev/dev_files/figure-html/diverging_bar.png +0 -0
data/blogs/dplyr/dplyr.rb +0 -63
data/blogs/galaaz_ggplot/galaaz_ggplot.aux +0 -43
data/blogs/galaaz_ggplot/galaaz_ggplot.log +0 -640
data/blogs/galaaz_ggplot/galaaz_ggplot.out +0 -10
data/blogs/galaaz_ggplot/galaaz_ggplot.tex +0 -481
data/blogs/galaaz_ggplot/midwest.png +0 -0
data/blogs/galaaz_ggplot/scatter_plot.png +0 -0
data/blogs/ruby_plot/ruby_plot.Rmd_external_figs +0 -662
data/blogs/ruby_plot/ruby_plot.tex +0 -1077
data/blogs/ruby_plot/ruby_plot_files/figure-html/dose_len.svg +0 -57
data/blogs/ruby_plot/ruby_plot_files/figure-html/facet_by_delivery.svg +0 -106
data/blogs/ruby_plot/ruby_plot_files/figure-html/facet_by_dose.svg +0 -110
data/blogs/ruby_plot/ruby_plot_files/figure-html/facets_by_delivery_color.svg +0 -174
data/blogs/ruby_plot/ruby_plot_files/figure-html/facets_by_delivery_color2.svg +0 -236
data/blogs/ruby_plot/ruby_plot_files/figure-html/facets_with_jitter.svg +0 -296
data/blogs/ruby_plot/ruby_plot_files/figure-html/facets_with_points.svg +0 -236
data/blogs/ruby_plot/ruby_plot_files/figure-html/final_box_plot.svg +0 -218
data/blogs/ruby_plot/ruby_plot_files/figure-html/final_violin_plot.svg +0 -128
data/blogs/ruby_plot/ruby_plot_files/figure-html/violin_with_jitter.svg +0 -150
data/examples/paper/paper.rb +0 -36

Binary file

@@ -0,0 +1,27 @@
+@book{Wilkinson:grammar_of_graphics,
+ author = {Wilkinson, Leland},
+ title = {The Grammar of Graphics (Statistics and Computing)},
+ year = {2005},
+ isbn = {0387245448},
+ publisher = {Springer-Verlag},
+ address = {Berlin, Heidelberg},
+}
+@article{Knuth:literate_programming,
+ author = {Knuth, Donald E.},
+ title = {Literate Programming},
+ journal = {Comput. J.},
+ issue_date = {May 1984},
+ volume = {27},
+ number = {2},
+ month = may,
+ year = {1984},
+ issn = {0010-4620},
+ pages = {97--111},
+ numpages = {15},
+ url = {http://dx.doi.org/10.1093/comjnl/27.2.97},
+ doi = {10.1093/comjnl/27.2.97},
+ acmid = {479},
+ publisher = {Oxford University Press},
+ address = {Oxford, UK},
+}

data/blogs/manual/lst.rds ADDED

Binary file

data/blogs/manual/manual.Rmd CHANGED

@@ -4,27 +4,28 @@ subtitle: "How to tightly couple Ruby and R in GraalVM"
 author: "Rodrigo Botafogo"
 tags: [Galaaz, Ruby, R, TruffleRuby, FastR, GraalVM, ggplot2]
 date: "2019"
+bibliography: "/home/rbotafogo/Bibliography/stats.bib"
 output:
-  html_document:
-    self_contained: true
-    keep_md: true
-  md_document:
-    variant: markdown_github
   pdf_document:
     includes:
       in_header: "../../sty/galaaz.sty"
     keep_tex: yes
     number_sections: yes
     toc: true
-    toc_depth: 2
+    toc_depth: 3
+  html_document:
+    self_contained: true
+    keep_md: true
+  md_document:
+    variant: markdown_github
 fontsize: 11pt
 ---
 ```{ruby setup, echo=FALSE}
+R.options(crayon__enabled: false)
 R.install_and_loads('kableExtra')
 ```
 # Introduction
 Galaaz is a system for tightly coupling Ruby and R. Ruby is a powerful language, with a large
@@ -34,6 +35,92 @@ other hand, R is considered one of the most powerful languages for solving all o
 problems. Maybe the strongest competitor to R is Python with libraries such as NumPy,
 Panda, SciPy, SciKit-Learn and a couple more.
+With Galaaz we do not intend to re-implement any of the scientific libraries in R. Instead, we
+allow for very tight coupling between the two languages, to the point that the Ruby developer
+does not need to know that there is an R engine running.
+According to Wikipedia "Ruby is a dynamic, interpreted, reflective, object-oriented,
+general-purpose programming language. It was designed and developed in the mid-1990s by Yukihiro
+"Matz" Matsumoto in Japan."  It reached high popularity with the development of Ruby on Rails
+(RoR) by David Heinemeier Hansson. RoR is a web application framework first released
+around 2005. It makes extensive use of Ruby's metaprogramming features.  With RoR,
+Ruby became very popular.  According to [Ruby's Tiobe index](https://www.tiobe.com/tiobe-index/ruby/)
+it peaked in popularity around 2008, then declined until 2015 when it started picking up again.
+At the time of this writing (November 2018), the Tiobe index ranks Ruby as the 16th most
+popular language.
+Python, a language similar to Ruby, ranks 4th in the index.  Java, C and C++ take the
+first three positions.  Ruby is often criticized for its focus on web applications.
+But Ruby can do [much more](https://github.com/markets/awesome-ruby) than just web applications.
+Yet, for scientific computing, Ruby lags way behind Python and R.  Python has the
+Django framework for the web, NumPy for numerical arrays, and Pandas for data analysis.
+R is a free software environment for statistical computing and graphics with thousands
+of libraries for data analysis.
+Until recently, there was no real prospect for Ruby to bridge this gap.
+Implementing a complete scientific computing infrastructure would take too long.
+Enter [Oracle's GraalVM](https://www.graalvm.org/):
+> GraalVM is a universal virtual machine for running applications written in
+> JavaScript, Python 3, Ruby, R, JVM-based languages like Java, Scala, Kotlin,
+> and LLVM-based languages such as C and C++.
+>
+> GraalVM removes the isolation between programming languages and enables
+> interoperability in a shared runtime. It can run either standalone or in the
+> context of OpenJDK, Node.js, Oracle Database, or MySQL.
+>
+> GraalVM allows you to write polyglot applications with a seamless way to pass
+> values from one language to another. With GraalVM there is no copying or
+> marshaling necessary as it is with other polyglot systems. This lets you
+> achieve high performance when language boundaries are crossed. Most of the time
+> there is no additional cost for crossing a language boundary at all.
+>
+> Often developers have to make uncomfortable compromises that require them
+> to rewrite their software in other languages. For example:
+>
+>  * That library is not available in my language. I need to rewrite it.
+>  * That language would be the perfect fit for my problem, but we cannot
+>    run it in our environment.
+>  * That problem is already solved in my language, but the language is
+>    too slow.
+>
+>  With GraalVM we aim to allow developers to freely choose the right language for
+>  the task at hand without making compromises.
+As stated above, GraalVM is a _universal_ virtual machine that allows Ruby and R (and other
+languages) to run in the same environment.  GraalVM allows polyglot applications to
+_seamlessly_ interact with one another and pass values from one language to the other.
+Although a great idea, GraalVM still requires application writers to know several languages.
+To eliminate that requirement, we built Galaaz, a gem for Ruby, to tightly couple
+Ruby and R and allow those languages to interact in a way that the user will be unaware
+of such interaction. In other words, a Ruby programmer will be able to use all
+the capabilities of R without knowing the R syntax.
+Library wrapping is a common way of bringing features from one language into another.
+To improve performance, Python often wraps more efficient C libraries. For the
+Python developer, the existence of such C libraries is hidden.  The problem with
+library wrapping is that for any new library, there is the need to handcraft a new
+wrapper.
+Galaaz, instead of wrapping a single C or R library, wraps the whole R language
+in Ruby.  As a result, all of the thousands of R libraries are immediately available
+to Ruby developers without any new wrapping effort.
+## What does Galaaz mean
+Galaaz is the Portuguese name for "Galahad".  From Wikipedia:
+    Sir Galahad (sometimes referred to as Galeas or Galath),
+    in Arthurian legend, is a knight of King Arthur's Round Table and one
+    of the three achievers of the Holy Grail. He is the illegitimate son
+    of Sir Lancelot and Elaine of Corbenic, and is renowned for his
+    gallantry and purity as the most perfect of all knights. Emerging quite
+    late in the medieval Arthurian tradition, Sir Galahad first appears in the
+    Lancelot–Grail cycle, and his story is taken up in later works such as
+    the Post-Vulgate Cycle and Sir Thomas Malory's Le Morte d'Arthur.
+    His name should not be mistaken with Galehaut, a different knight from
+    Arthurian legend.
 # System Compatibility
 * Oracle Linux 7
@@ -84,7 +171,7 @@ Panda, SciPy, SciKit-Learn and a couple more.
   > galaaz -T
  Shows a list with all available executable tasks.  To execute a task, substitute the
-  'rake' word in the list with 'galaaz'.  For instance, the following line shows up
+   'rake' word in the list with 'galaaz'.  For instance, the following line shows up
   after 'galaaz -T'
   rake master_list:scatter_plot        # scatter_plot from:....
@@ -93,9 +180,711 @@ Panda, SciPy, SciKit-Learn and a couple more.
   > galaaz master_list:scatter_plot
-# Basic Types
-## Vectors
+# Accessing R from Ruby
+One of the nice aspects of Galaaz on GraalVM is that variables and functions defined in R can
+be easily accessed from Ruby.  For instance, to access the 'mtcars' data frame from R
+in Ruby, we use the ':mtcars' symbol preceded by the '~' operator; thus '~:mtcars' retrieves the
+value of the 'mtcars' variable.
+```{ruby access_r}
+puts ~:mtcars
+```
+To access an R function from Ruby, the R function needs to be preceded by the 'R.' scoping.
+Below we see an example of creating an R::Vector by calling the 'c' R function:
+```{ruby call_r_func}
+puts vec = R.c(1.0, 2.0, 3.0, 4.0)
+```
+Note that 'vec' is an object of type R::Vector:
+```{ruby r_object}
+puts vec.class
+```
+Every object created by a call to an R function will be of a type that inherits from
+R::Object. In R, there is also a function 'class'. In order to access that function we
+can call method 'rclass' on the R::Object:
+```{ruby rclass}
+puts vec.rclass
+```
+When working with R::Object(s), it is possible to use the '.' operator to pipe operations.
+When using '.', the object to which the '.' is applied becomes the first argument of the
+corresponding R function. For instance, function 'c' in R can be used to concatenate
+two or more vectors. (In R, there are no scalar values; scalars are converted to
+vectors of size 1. Within Galaaz, a scalar parameter is likewise converted to a size-one vector.)
+```{ruby concat}
+puts R.c(vec, 10, 20, 30)
+```
+The call above to the 'c' function can also be done using '.' notation:
+```{ruby concat_with_dot}
+puts vec.c(10, 20, 30)
+```
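As a rough illustration of the '.' notation described above, the following plain-Ruby sketch forwards an unknown method call to a module function and passes the receiver as the first argument. The names `FakeR` and `Vec` are hypothetical stand-ins invented for this sketch. It is not how Galaaz is implemented and it does not use an R engine; it only mimics the calling convention.

```ruby
# Hypothetical stand-in for illustration only; Galaaz's real R module
# talks to an R engine, this one just works on Ruby arrays.
module FakeR
  # Mimics R's c(): concatenates scalars and vectors into one vector.
  def self.c(*args)
    Vec.new(args.flat_map { |arg| arg.is_a?(Vec) ? arg.values : [arg] })
  end
end

# Hypothetical vector wrapper (not R::Vector).
class Vec
  attr_reader :values

  def initialize(values)
    @values = values
  end

  # Forward unknown methods to FakeR, with the receiver as first argument.
  def method_missing(name, *args)
    return super unless FakeR.respond_to?(name)

    FakeR.public_send(name, self, *args)
  end

  def respond_to_missing?(name, include_private = false)
    FakeR.respond_to?(name) || super
  end
end

vec = FakeR.c(1.0, 2.0, 3.0, 4.0)
direct = FakeR.c(vec, 10, 20, 30) # function-call form
piped = vec.c(10, 20, 30)         # '.' form, receiver becomes first argument
```

In this sketch both forms yield the same values, which is the behavior the text describes for `R.c(vec, 10, 20, 30)` and `vec.c(10, 20, 30)`.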
+We will talk about vector indexing in a later section. But notice here that indexing
+an R::Vector will return another R::Vector:
+```{ruby indexing}
+puts vec[1]
+```
+Sometimes we want to index an R::Object and get back a Ruby object that is not wrapped
+in an R::Object, but the native Ruby object. For this, we can index the R object with
+the '>>' operator:
+```{ruby native_value}
+puts vec >> 0
+puts vec >> 2
+```
+It is also possible to call an R function with named arguments, by calling the function
+in Galaaz with named parameters. For instance, here is an example of creating a 'list'
+with named elements:
+```{ruby named_parameters}
+puts R.list(first_name: "Rodrigo", last_name: "Botafogo")
+```
+Many R functions receive another function as argument. For instance, method 'map' applies
+a function to every element of a vector. With Galaaz, it is possible to pass a Proc,
+Method or Lambda in place of the expected R function. In this next example, we will
+add 2 to every element of our previously created vector:
+```{ruby proc_as_param}
+puts vec.map { |x| x + 2 }
+```
+# gKnitting a Document
+This manual has been formatted using gKnit.  gKnit uses Knitr and R markdown to knit
+a document in Ruby or R and output it in any of the available formats for R markdown.
+gKnit runs atop GraalVM and Galaaz.  In gKnit, Ruby variables are persisted between
+chunks, making it an ideal solution for literate programming. Also, since it is based
+on Galaaz, Ruby chunks can have access to R variables and Polyglot Programming with
+Ruby and R is quite natural.
+The idea of "literate programming" was first introduced by Donald Knuth in the
+1980s [@Knuth:literate_programming].
+The main intention of this approach was to develop software interspersing macro snippets,
+traditional source code, and a natural language such as English in a document
+that could be compiled into
+executable code and at the same time easily read by a human developer. According to Knuth
+"The practitioner of
+literate programming can be regarded as an essayist, whose main concern is with exposition
+and excellence of style."
+The idea of literate programming evolved into the idea of reproducible research, in which
+all the data, software code, documentation, graphics etc. needed to reproduce the research
+and its reports could be included in a
+single document or set of documents that when distributed to peers could be rerun generating
+the same output and reports.
+The R community has put a great deal of effort into reproducible research.  In 2002, Sweave was
+introduced and it allowed mixing R code with LaTeX, generating high quality PDF documents.  A
+Sweave document could include code, the results of executing the code, graphics and text
+such that it contained the whole narrative to reproduce the research.  In
+2012, Knitr, developed by Yihui Xie from RStudio, was released to replace Sweave and to
+consolidate in one single package the many extensions and add-on packages that
+were necessary for Sweave.
+With Knitr, __R markdown__ was also developed, an extension to the
+Markdown format.  With __R markdown__ and Knitr it is possible to generate reports in a multitude
+of formats such as HTML, markdown, LaTeX, PDF, dvi, etc.  __R markdown__ also allows the use of
+multiple programming languages such as R, Ruby, Python, etc. in the same document.
+In __R markdown__, text is interspersed with
+code chunks that can be executed and both the code and its results can become
+part of the final report.  Although __R markdown__ allows multiple programming languages in the
+same document, only R and Python (with
+the reticulate package) can persist variables between chunks.  For other languages, such as
+Ruby, every chunk will start a new process and thus all data is lost between chunks, unless it
+is somehow stored in a data file that is read by the next chunk.
+Being able to persist data
+between chunks is critical for literate programming; otherwise the flow of the narrative is lost
+in the effort of having to save data and then reload it. Although this might, at first, seem like
+a small nuisance, not being able to persist data between chunks is a major issue. For example, let's
+take a look at the following simple example in which we want to show how to create a list and then
+use it.  Let's first assume that data cannot be persisted between chunks.  In the next chunk we
+create a list and then need to save it to a file, but to save it, we need to somehow marshal the
+data into a binary format:
+```{ruby no_persistence}
+lst = R.list(a: 1, b: 2, c: 3)
+lst.saveRDS("lst.rds")
+```
+Then, in the next chunk, where variable 'lst' is used, we need to read back its value:
+```{ruby load_persisted_data}
+lst = R.readRDS("lst.rds")
+puts lst
+```
+Now, any realistic piece of code has dozens of variables that we might want to use and reuse
+between chunks.  Clearly, such an approach quickly becomes unmanageable. Probably because of
+this problem, it is very rare to see any __R markdown__ document in the Ruby community.
+When variables can be used across chunks, then no overhead is needed:
+```{ruby persistence}
+lst = R.list(a: 1, b: 2, c: 3)
+# any other code can be added here
+```
+```{ruby use_var}
+puts lst
+```
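One way to picture persistence between chunks is to evaluate every chunk against the same `Binding`, so that local variables defined in one chunk remain visible in the next. The plain-Ruby sketch below does that with a hypothetical `ChunkSession` class. It is an assumption made for illustration, not gKnit's actual implementation, and it uses an ordinary Hash in place of an R list so it runs without Galaaz.

```ruby
# Hypothetical helper, not part of gKnit: evaluates chunk sources against a
# single shared Binding so local variables survive from one chunk to the next.
class ChunkSession
  def initialize
    # A fresh object gives the chunks their own 'self' and an empty scope.
    @scope = Object.new.instance_eval { binding }
  end

  # Evaluates one chunk's source code and returns the value of its last expression.
  def run(source)
    @scope.eval(source)
  end
end

session = ChunkSession.new

# First "chunk": define a variable.
session.run("lst = { a: 1, b: 2, c: 3 }")

# Second "chunk": the variable is still there, no save/reload step needed.
keys = session.run("lst.keys")
total = session.run("lst.values.sum")
```

If each chunk were instead run in a fresh process, as the text says happens for non-persistent engines, the second call would fail because `lst` would be undefined.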
+In the Python community, the same effort to have code and text in an integrated environment
+started in the first decade of the 2000s. In 2006 iPython 0.7.2 was released.  In 2014,
+Fernando Pérez spun off project Jupyter from iPython, creating a web-based interactive
+computation environment.  Jupyter can now be used with many languages, including Ruby with the
+iruby gem (https://github.com/SciRuby/iruby).  In order to have multiple languages in a Jupyter
+notebook the SoS kernel was developed (https://vatlab.github.io/sos-docs/).
+## gKnit and __R markdown__
+gKnit is based on knitr and __R markdown__ and can knit a document
+written in Ruby, R, or both and output it in any of the available formats of __R markdown__.  gKnit
+allows Ruby developers to do literate programming and reproducible research by allowing them to
+have text and code in a single document.
+In gKnit, Ruby variables are persisted between
+chunks, making  it an ideal solution for literate programming in this language.  Also,
+since it is based on  Galaaz, Ruby chunks can have access to R variables and Polyglot Programming
+with Ruby and R is quite natural.
+This is not a blog post on __R markdown__, and the interested user is directed to the following links
+for detailed information on its capabilities and use.
+* https://rmarkdown.rstudio.com/ or
+* https://bookdown.org/yihui/rmarkdown/
+In this post, we will describe just the main aspects of __R markdown__, so the user can start
+gKnitting Ruby and R documents quickly.
+## The Yaml header
+An __R markdown__ document should start with a Yaml header and be stored in a file with the
+'.Rmd' extension. This document has the following header for gKnitting an HTML document.
+```
+---
+title: "How to do reproducible research in Ruby with gKnit"
+author:
+    - "Rodrigo Botafogo"
+    - "Daniel Mossé - University of Pittsburgh"
+tags: [Tech, Data Science, Ruby, R, GraalVM]
+date: "20/02/2019"
+output:
+  html_document:
+    self_contained: true
+    keep_md: true
+  pdf_document:
+    includes:
+      in_header: ["../../sty/galaaz.sty"]
+    number_sections: yes
+---
+```
+For more information on the options in the Yaml header, [check here](https://bookdown.org/yihui/rmarkdown/html-document.html).
+## __R Markdown__ formatting
+Document formatting can be done with simple markups such as:
+## Headers
+```
+# Header 1
+## Header 2
+### Header 3
+```
+## Lists
+```
+Unordered lists:
+* Item 1
+* Item 2
+    + Item 2a
+    + Item 2b
+```
+```
+Ordered Lists
+1. Item 1
+2. Item 2
+3. Item 3
+    + Item 3a
+    + Item 3b
+```
+For more R markdown formatting go to https://rmarkdown.rstudio.com/authoring_basics.html.
+## R chunks
+Running and executing Ruby and R code is what really interests us in this blog.
+Inserting a code chunk is done by adding code in a block delimited by three back ticks
+followed by an open
+curly brace ('{') followed by the engine name (r, ruby, rb, include, ...), and
+an optional chunk_label and options, as shown below:
+````
+```{engine_name [chunk_label], [chunk_options]}`r ''`
+```
+````
+For instance, let's add an R chunk to the document labeled 'first_r_chunk'.  This is
+very simple code that just creates a variable and prints it out, as follows:
+````
+```{r first_r_chunk}`r ''`
+vec <- c(1, 2, 3)
+print(vec)
+```
+````
+If this block is added to an __R markdown__ document and gKnitted, the result will be:
+```{r first_r_chunk}
+vec <- c(1, 2, 3)
+print(vec)
+```
+Now let's say that we want to do some analysis in the code, but just print the result and not the
+code itself.  For this, we need to add the option 'echo = FALSE'.
+````
+```{r second_r_chunk, echo = FALSE}`r ''`
+vec2 <- c(10, 20, 30)
+vec3 <- vec * vec2
+print(vec3)
+```
+````
+Here is how this block will show up in the document. Observe that the code is not shown
+and we only see the execution result in a white box
+```{r second_r_chunk, echo = FALSE}
+vec2 <- c(10, 20, 30)
+vec3 <- vec * vec2
+print(vec3)
+```
+A description of the available chunk options can be found in https://yihui.name/knitr/.
+Let's add another R chunk with a function definition.  In this example, a vector
+'r_vec' is created and
+a new function 'reduce_sum' is defined.  The chunk specification is
+````
+```{r data_creation}`r ''`
+r_vec <- c(1, 2, 3, 4, 5)
+reduce_sum <- function(...) {
+  Reduce(sum, as.list(...))
+}
+```
+````
+and this is what it looks like once executed.  From now on, to be concise in the
+presentation, we will not show chunk definitions any longer.
+```{r data_creation}
+r_vec <- c(1, 2, 3, 4, 5)
+reduce_sum <- function(...) {
+  Reduce(sum, as.list(...))
+}
+```
+We can, possibly in another chunk, access the vector and call the function as follows:
+```{r using_previous}
+print(r_vec)
+print(reduce_sum(r_vec))
+```
+## R Graphics with ggplot
+In the following chunk, we create a bubble chart in R using ggplot and include it in
+this document.  Note that there is no directive in the code to include the image, this
+occurs automatically.  The 'mpg' dataframe is natively available to R and to Galaaz as
+well.
+For readers unfamiliar with ggplot, ggplot is a graphics library based on "the
+grammar of graphics" [@Wilkinson:grammar_of_graphics]. The idea of the grammar of graphics
+is to build a graphic by adding layers to the plot.  More information can be found in
+https://towardsdatascience.com/a-comprehensive-guide-to-the-grammar-of-graphics-for-effective-visualization-of-multi-dimensional-1f92b4ed4149.
+In the plot below the 'mpg' dataset from the ggplot2 package is used. "The data concerns city-cycle fuel
+consumption in miles per gallon, to be predicted in terms of 3 multivalued discrete and 5
+continuous attributes." (Quinlan, 1993)
+First, the 'mpg' dataset is filtered to extract only cars from the following manufacturers: Audi, Ford,
+Honda, and Hyundai, and the result is stored in the 'mpg_select' variable.  Then, the selected dataframe is passed
+to the ggplot function, specifying in the aesthetic method (aes) that 'displacement' (displ) should
+be plotted on the 'x' axis and 'city mileage' (cty) should be on the 'y' axis.  In the 'labs' layer we
+pass the 'title' and 'subtitle' for the plot.  To the basic plot 'g', geom\_jitter is added, which
+plots cars from the same manufacturer with the same color (col=manufacturer) and sets the size of each
+car's point according to its highway consumption (size = hwy).  Finally, a last layer is plotted containing
+a linear regression line (method = "lm") for every manufacturer.
+```{r bubble, dev='png'}
+# load package and data
+library(ggplot2)
+data(mpg, package="ggplot2")
+mpg_select <- mpg[mpg$manufacturer %in% c("audi", "ford", "honda", "hyundai"), ]
+# Scatterplot
+theme_set(theme_bw())  # pre-set the bw theme.
+g <- ggplot(mpg_select, aes(displ, cty)) +
+  labs(subtitle="mpg: Displacement vs City Mileage",
+       title="Bubble chart")
+g + geom_jitter(aes(col=manufacturer, size=hwy)) +
+  geom_smooth(aes(col=manufacturer), method="lm", se=F)
+```
+## Ruby chunks
+Including a Ruby chunk is just as easy as including an R chunk in the document: just
+change the name of the engine to 'ruby'.  It is also possible to pass chunk options
+to the Ruby engine; however, this version does not accept all the options that are
+available to R chunks.  Future versions will add those options.
+````
+```{ruby first_ruby_chunk}`r ''`
+```
+````
+In this example, the ruby chunk is called 'first_ruby_chunk'.  One important
+aspect of chunk labels is that they cannot be duplicated.  If a chunk label is
+duplicated, gKnit will stop with an error.
+In the following chunk, variables 'a', 'b' and 'c' are standard Ruby variables
+and 'vec' and 'vec2' are two vectors created by calling the 'c' method on the
+R module.
+In Galaaz, the R module allows us to access R functions transparently.  The 'c'
+function in R is a function that concatenates its arguments, making a vector.
+It
+should be clear that there is no requirement in gKnit to call or use any R
+functions.  gKnit will knit standard Ruby code, or even general text without
+any code.
+```{ruby split_data}
+a = [1, 2, 3]
+b = "US$ 250.000"
+c = "The 'outputs' function"
+vec = R.c(1, 2, 3)
+vec2 = R.c(10, 20, 30)
+```
+In the next block, variables 'a', 'vec' and 'vec2' are used and printed.
+```{ruby split2}
+puts a
+puts vec * vec2
+```
+Note that 'a' is a standard Ruby Array and 'vec' and 'vec2' are vectors that behave accordingly,
+where multiplication works as expected.
+## Inline Ruby code
+When using a Ruby chunk, the code and the output are formatted in blocks as seen above.
+This formatting is not always desired.  Sometimes, we want to have the results of the
+Ruby evaluation included in the middle of a phrase. gKnit allows adding inline Ruby code
+with the 'rb' engine.  The following chunk specification will
+create an inline Ruby text:
+````
+This is some text with inline Ruby accessing variable 'b' which has value:
+```{rb puts "```{rb puts b}\n```"}
+```
+and is followed by some other text!
+````
+<div style="margin-bottom:30px;">
+</div>
+This is some text with inline Ruby accessing variable 'b' which has value:
+```{rb puts b}
+```
+and is followed by some other text!
+<div style="margin-bottom:30px;">
+</div>
+Note that it is important not to add any new line before or after the code
+block if we want everything to be in only one line, resulting in the following sentence
+with inline Ruby code.
+```{ruby heading, echo = FALSE}
+outputs "### #{c}"
+```
+We have previously used the standard 'puts' method in Ruby chunks in order to produce
+output.  The result of a 'puts', as seen in all previous chunks that use it, is formatted
+inside a white box that
+follows the code block. Many times, however, we would like to do some processing in the
+Ruby chunk and have the result of this processing generate an output that is
+"included" in the document as if we had typed it in the __R markdown__ document.
+For example, suppose we want to create a new heading in our document, but the heading
+phrase is the result of some code processing: maybe it's the first line of a file we are
+going to read.  Method 'outputs' adds its output as if typed in the __R markdown__ document.
+Now take a look at variable 'c', which was defined in a previous block as
+'c = "The 'outputs' function"'.  "The 'outputs' function" is actually the name of this
+section and it was created using the 'outputs' function inside a Ruby chunk.
+The ruby chunk to generate this heading is:
+````
+```{ruby heading}`r ''`
+outputs "### #{c}"
+```
+````
+The three '###' are the way we add a Heading 3 in __R markdown__.
+### HTML Output from Ruby Chunks
+We've just seen the use of method 'outputs' to add text to the __R markdown__
+document.  This technique can also be used to add HTML code to the document. In
+__R markdown__, any html code typed directly in the document will be properly rendered.
+Here, for instance, is a table definition in HTML and its output in the document:
+```
+<table style="width:100%">
+  <tr>
+    <th>Firstname</th>
+    <th>Lastname</th>
+    <th>Age</th>
+  </tr>
+  <tr>
+    <td>Jill</td>
+    <td>Smith</td>
+    <td>50</td>
+  </tr>
+  <tr>
+    <td>Eve</td>
+    <td>Jackson</td>
+    <td>94</td>
+  </tr>
+</table>
+```
+<div style="margin-bottom:30px;">
+</div>
+<table style="width:100%">
+  <tr>
+    <th>Firstname</th>
+    <th>Lastname</th>
+    <th>Age</th>
+  </tr>
+  <tr>
+    <td>Jill</td>
+    <td>Smith</td>
+    <td>50</td>
+  </tr>
+  <tr>
+    <td>Eve</td>
+    <td>Jackson</td>
+    <td>94</td>
+  </tr>
+</table>
+<div style="margin-bottom:30px;">
+</div>
+But manually creating HTML output is not always easy or desirable, especially
+if we intend the document to be rendered in other formats, for example, as LaTeX.
+Also, the above
+table looks ugly.  The 'kableExtra' library is a great library for
+creating beautiful tables. Take a look at https://cran.r-project.org/web/packages/kableExtra/vignettes/awesome_table_in_html.html.
+In the next chunk, we output the 'mtcars' dataframe from R in a nicely formatted
+table.  Note that we retrieve the mtcars dataframe by using '~:mtcars'.
+```{ruby nice_table}
+R.install_and_loads('kableExtra')
+outputs (~:mtcars).kable.kable_styling
+```
+## Including Ruby files in a chunk
+R is a language that was created to be easy and fast for statisticians to use.  As far
+as I know, it was not designed as a
+language for developing large systems.  Of course, there are large systems and
+libraries in R, but the focus of the language is on developing statistical models and
+distributing them to peers.
+Ruby on the other hand, is a language for large software development.  Systems written in
+Ruby will have dozens, hundreds or even thousands of files.  To document a
+large system with literate programming, we cannot expect the developer to add all the
+files in a single '.Rmd' file.  gKnit provides the 'include' chunk engine to include
+a Ruby file as if it had been typed in the '.Rmd' file.
+To include a file, the following chunk should be created, where <filename> is the name of
+the file to be included and where the extension, if it is '.rb', does not need to be added.
+If the 'relative' option is not included, then it is treated as TRUE.  When 'relative' is
+true, Ruby's 'require\_relative' semantics is used to load the file; when false, Ruby's
+\$LOAD_PATH is searched to find the file and it is 'require'd.
+````
+```{include <filename>, relative = <TRUE/FALSE>}`r ''`
+```
+````
+Below we include file 'model.rb', which is in the same directory as this blog.
+This code uses the R 'caret' package to split a dataset into train and test sets.
+The 'caret' package is a very important and useful package for doing Data Analysis;
+it has hundreds of functions for all steps of the Data Analysis workflow.  To
+use 'caret' just to split a dataset is like using the proverbial cannon to
+kill the fly.  We use it here only to show that integrating Ruby and R and
+using even a very complex package such as 'caret' is trivial with Galaaz.
+A word of advice: the 'caret' package has lots of dependencies and installing
+it on a Linux system is a time-consuming operation.  Method 'R.install_and_loads'
+will install the package if it is not already installed, which can take a while.
+````
+```{include model}`r ''`
+```
+````
+```{include model}
+```
+```{ruby model_partition}
+mtcars = ~:mtcars
+model = Model.new(mtcars, percent_train: 0.8)
+model.partition(:mpg)
+puts model.train.head
+puts model.test.head
+```
+## Documenting Gems
+gKnit also allows developers to document and load files that are not in the same directory
+as the '.Rmd' file.
+Here is an example of loading the 'find.rb' file from TruffleRuby. In this example, relative
+is set to FALSE, so Ruby will look for the file in its $LOAD\_PATH, and the user does not
+need to know its directory.
+````
+```{include find, relative = FALSE}`r ''`
+```
+````
+```{include find, relative = FALSE}
+```
+## Converting to PDF
+One of the beauties of knitr is that the same input can be converted to many different outputs.
+One very useful format is, of course, PDF.  In order to convert an __R markdown__ file to PDF
+it is necessary to have LaTeX installed on the system.  We will not explain here how to
+install LaTeX as there are plenty of documents on the web showing how to proceed.
+gKnit comes with a simple LaTeX style file for gknitting this blog as a PDF document.  Here is
+the Yaml header to generate this blog in PDF format instead of HTML:
+```
+---
+title: "gKnit - Ruby and R Knitting with Galaaz in GraalVM"
+author: "Rodrigo Botafogo"
+tags: [Galaaz, Ruby, R, TruffleRuby, FastR, GraalVM, knitr, gknit]
+date: "29 October 2018"
+output:
+  pdf_document:
+    includes:
+      in_header: ["../../sty/galaaz.sty"]
+    number_sections: yes
+---
+```
+## Template based documents generation
+When a document is converted to PDF it follows a certain conversion template. We've seen above
+the use of 'galaaz.sty' as a basic template to generate a PDF document.  Using the
+'gknit-draft' app that comes with Galaaz, the same .Rmd file can be compiled to different
+looking PDF documents. Galaaz automatically loads the 'rticles' R package that comes with
+templates for the following journals with the respective template name:
+* ACM articles: acm_article
+* ACS articles: acs_article
+* AEA journal submissions: aea_article
+* AGU journal submissions: ????
+* AMS articles: ams_article
+* American Statistical Association: asa_article
+* Biometrics articles: biometrics_article
+* Bulletin de l'AMQ journal submissions: amq_article
+* CTeX documents: ctex
+* Elsevier journal submissions: elsevier_article
+* IEEE Transaction journal submissions: ieee_article
+* JSS articles: jss_article
+* MDPI journal submissions: mdpi_article
+* Monthly Notices of the Royal Astronomical Society articles: mnras_article
+* NNRAS journal submissions: nmras_article
+* PeerJ articles: peerj_article
+* Royal Society Open Science journal submissions: rsos_article
+* Royal Statistical Society: rss_article
+* Sage journal submissions: sage_article
+* Springer journal submissions: springer_article
+* Statistics in Medicine journal submissions: sim_article
+* Copernicus Publications journal submissions: copernicus_article
+* The R Journal articles: rjournal_article
+* Frontiers articles: ???
+* Taylor & Francis articles: ???
+* Bulletin De L'AMQ: amq_article
+* PLOS journal: plos_article
+* Proceedings of the National Academy of Sciences of the USA: pnas_article
+In order to create a document with one of those templates, use the following command:
+```
+gknit-draft --filename <my_document> --template <template> --package <package>
+            --create_dir
+```
+So, in order to create a template for writing an R Journal article, use:
+```
+gknit-draft --filename my_r_article --template rjournal_article --package rticles
+            --create_dir
+```
+# Accessing R variables
+Galaaz allows Ruby to access variables created in R.  For example, the 'mtcars' data set is
+available in R and can be accessed from Ruby by using the 'tilde' operator followed by the
+symbol for the variable, in this case ':mtcars'.  In the code below, method 'outputs' is
+used to output the 'mtcars' data set nicely formatted in HTML by use of the 'kable' and
+'kable_styling' functions. Method 'outputs' is only available when used with 'gknit'.
+```{ruby view_kable}
+outputs (~:mtcars).kable.kable_styling
+```
+# Basic Data Types
+## Vector
 Vectors can be thought of as contiguous cells containing data. Cells are accessed through
 indexing operations such as x[5]. Galaaz has six basic (‘atomic’) vector types: logical,
@@ -120,20 +909,22 @@ vector is often referred to as a character string.
 To create a vector the 'c' (concatenate) method from the 'R' module should be used:
 ```{ruby integer}
-@vec = R.c(1, 2, 3)
-puts @vec
+vec = R.c(1, 2, 3)
+puts vec
 ```
-Lets take a look at the type, mode and storage.mode of our vector @vec.  In order to print
+Let's take a look at the type, mode and storage.mode of our vector vec.  In order to print
 this out, we are creating a data frame 'df' and printing it out.  A data frame, for those
-not familiar with it, it basically a table.  Here we create the data frame and add the
+not familiar with it, is basically a table.  Here we create the data frame and add the
 column name by passing named parameters for each column, such as 'typeof:', 'mode:' and
-'storage__mode'.  You should also note here that the double underscore is converted to a '.'.
+'storage\_\_mode:'.  You should also note here that the double underscore is converted to a '.'.
+So, when printed 'storage\_\_mode' will actually print as 'storage.mode'.
-In R, the method used to create a data frame is 'data.frame', in Galaaz we use 'data__frame'.
+Data frames will be described more carefully later.  In R, the method used to create a
+data frame is 'data.frame'; in Galaaz we use 'data\_\_frame'.
 ```{ruby typeof_integer}
-df = R.data__frame(typeof: @vec.typeof, mode: @vec.mode, storage__mode: @vec.storage__mode)
+df = R.data__frame(typeof: vec.typeof, mode: vec.mode, storage__mode: vec.storage__mode)
 puts df
 ```
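The double-underscore convention described above can be pictured with a few lines of plain Ruby. This is only a sketch of the naming rule, not Galaaz's actual implementation; the helper name `r_name` is hypothetical and exists only for this illustration.

```ruby
# Hypothetical helper (not part of Galaaz): maps a Ruby-side name to the
# R name it stands for by turning every double underscore into a dot.
# Single underscores are left untouched.
def r_name(ruby_name)
  ruby_name.to_s.gsub("__", ".")
end

puts r_name(:data__frame)    # data.frame
puts r_name(:storage__mode)  # storage.mode
puts r_name(:kable_styling)  # kable_styling
```

The rule is needed because '.' is the method-call operator in Ruby and cannot appear inside a method name or a keyword argument.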
@@ -143,12 +934,12 @@ like '1' is converted to float and to have an integer the R developer will use '
 follows normal Ruby rules and the number 1 is an integer and 1.0 is a float.
 ```{ruby float}
-@vec = R.c(1.0, 2, 3)
-puts @vec
+vec = R.c(1.0, 2, 3)
+puts vec
 ```
 ```{ruby typeof_float}
-df = R.data__frame(typeof: @vec.typeof, mode: @vec.mode, storage__mode: @vec.storage__mode)
+df = R.data__frame(typeof: vec.typeof, mode: vec.mode, storage__mode: vec.storage__mode)
 outputs df.kable.kable_styling
 ```
@@ -161,47 +952,1101 @@ of the error.
 vec = R.c(1, hello, 5)
 ```
-```{ruby view_kable}
-outputs (~:mtcars).kable.kable_styling
+Here is a vector with logical values:
+```{ruby logical_vector}
+vec = R.c(true, true, false, false, true)
+puts vec
 ```
+### Combining Vectors
-## Graphics with ggplot
+The 'c' function used to create vectors can also be used to combine two vectors:
-```{ruby diverging_bar}
-require 'ggplot'
+```{ruby combining_vectors}
+vec1 = R.c(10.0, 20.0, 30.0)
+vec2 = R.c(4.0, 5.0, 6.0)
+vec = R.c(vec1, vec2)
+puts vec
+```
+In Galaaz, methods can be chained (somewhat like the pipe operator in R %>%, but more generic).
+In this next example, method 'c' is chained after 'vec1'.  This also looks like 'c' is a
+method of the vector, but in reality, this is actually closer to the pipe operator.  When
+Galaaz identifies that 'c' is not a method of 'vec1' it actually tries to call 'R.c' with
+'vec1' as the first argument concatenated with all the other available arguments.  The code
+below is automatically converted to the code above.
-R.theme_set R.theme_bw
+```{ruby chainning_methods}
+vec = vec1.c(vec2)
+puts vec
+```
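The forwarding rule described above can be sketched in plain Ruby with `method_missing`. This is an illustration under stated assumptions, not Galaaz's implementation: `ToyR` and `ToyVector` are hypothetical stand-ins for the R module and for R vectors, and `ToyR` offers only a `c` function.

```ruby
# Stand-in for the R module (hypothetical): 'c' concatenates plain values
# and ToyVector contents into a new ToyVector.
module ToyR
  def self.c(*args)
    ToyVector.new(args.flat_map { |a| a.is_a?(ToyVector) ? a.values : [a] })
  end
end

# Stand-in for an R vector (hypothetical).
class ToyVector
  attr_reader :values

  def initialize(values)
    @values = values
  end

  # A method the vector does not define is forwarded to ToyR, with the
  # receiver placed in front of the other arguments.
  def method_missing(name, *args)
    return super unless ToyR.respond_to?(name)
    ToyR.public_send(name, self, *args)
  end

  def respond_to_missing?(name, include_private = false)
    ToyR.respond_to?(name) || super
  end
end

vec1 = ToyR.c(10.0, 20.0, 30.0)
vec2 = ToyR.c(4.0, 5.0, 6.0)
p vec1.c(vec2).values   # same elements as ToyR.c(vec1, vec2).values
```

With this dispatch, `vec1.c(vec2)` and `ToyR.c(vec1, vec2)` go through the same function, which is why the chained form reads like a pipe.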
-# Data Prep
-mtcars = ~:mtcars
-mtcars.car_name = R.rownames(:mtcars)
-# compute normalized mpg
-mtcars.mpg_z = ((mtcars.mpg - mtcars.mpg.mean)/mtcars.mpg.sd).round 2
-mtcars.mpg_type = mtcars.mpg_z < 0 ? "below" : "above"
-mtcars = mtcars[mtcars.mpg_z.order, :all]
-# convert to factor to retain sorted order in plot
-mtcars.car_name = mtcars.car_name.factor levels: mtcars.car_name
+### Vector Arithmetic
-# Diverging Barcharts
-gg = mtcars.ggplot(E.aes(x: :car_name, y: :mpg_z, label: :mpg_z)) +
-     R.geom_bar(E.aes(fill: :mpg_type), stat: 'identity',  width: 0.5) +
-     R.scale_fill_manual(name: "Mileage",
-                         labels: R.c("Above Average", "Below Average"),
-                         values: R.c("above": "#00ba38", "below": "#f8766d")) +
-     R.labs(subtitle: "Normalised mileage from 'mtcars'",
-            title: "Diverging Bars") +
-     R.coord_flip()
+Arithmetic operations on vectors are performed element by element:
-puts gg
+```{ruby vec_arith1}
+puts vec1 + vec2
 ```
+```{ruby mult}
+puts vec1 * 5
+```
-[TO BE CONTINUED...]
+When vectors have different lengths, a recycling rule is applied to the shorter vector:
+```{ruby recycle}
+vec3 = R.c(1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0)
+puts vec4 = vec1 + vec3
+```
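The recycling rule can be sketched in plain Ruby, without Galaaz, to show what happens to the shorter operand. The helper `recycled_add` is hypothetical and works on ordinary Ruby arrays; it assumes both arrays are non-empty.

```ruby
# Hypothetical helper (plain Ruby, not Galaaz): adds two arrays element by
# element, reusing the shorter one from its start whenever it runs out.
def recycled_add(xs, ys)
  longer, shorter = xs.length >= ys.length ? [xs, ys] : [ys, xs]
  longer.each_with_index.map { |value, i| value + shorter[i % shorter.length] }
end

vec1 = [10.0, 20.0, 30.0]
vec3 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
p recycled_add(vec1, vec3)
# [11.0, 22.0, 33.0, 14.0, 25.0, 36.0, 17.0, 28.0, 39.0]
```

Here the three elements of 'vec1' are used three times over to match the nine elements of 'vec3'.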
-# Contributing
+### Vector Indexing
+Vectors can be indexed by using the '[]' operator:
+```{ruby index}
+puts vec4[3]
+```
+We can also index a vector with another vector.  For example, in the code below, we take elements
+1, 3, 5, and 7 from vec4:
+```{ruby index_by_vector}
+puts vec4[R.c(1, 3, 5, 7)]
+```
+Repeating an index and having indices out of order is valid code:
+```{ruby repeated_index}
+puts vec4[R.c(1, 3, 3, 1)]
+```
+It is also possible to index a vector with a negative number or negative vector.  In these cases
+the indexed values are not returned:
+```{ruby neg_index}
+puts vec4[-3]
+puts vec4[-R.c(1, 3, 5, 7)]
+```
+If an index is out of range, a missing value (NA) will be reported.
+```{ruby out_of_range}
+puts vec4[30]
+```
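The indexing rules seen so far (1-based positions, repeated indices, negative indices that drop elements, and a missing value for out-of-range positions) can be summarized in a plain Ruby sketch. The helper `r_index` is hypothetical and works on ordinary arrays, using `nil` in place of NA; it ignores corner cases such as index 0 or mixing positive and negative indices.

```ruby
# Hypothetical helper (plain Ruby, not Galaaz): R-style indexing on an Array.
# - positive indices are 1-based and may repeat or come in any order
# - all-negative indices return everything except those positions
# - a position outside the array yields nil (standing in for NA)
def r_index(values, index)
  idx = Array(index)
  if idx.all? { |i| i < 0 }
    drop = idx.map { |i| -i - 1 }   # 0-based positions to leave out
    values.each_index.reject { |pos| drop.include?(pos) }.map { |pos| values[pos] }
  else
    idx.map { |i| i.between?(1, values.length) ? values[i - 1] : nil }
  end
end

v = [11.0, 22.0, 33.0, 14.0, 25.0]
p r_index(v, 3)             # [33.0]
p r_index(v, [1, 3, 3, 1])  # [11.0, 33.0, 33.0, 11.0]
p r_index(v, -3)            # [11.0, 22.0, 14.0, 25.0]
p r_index(v, 30)            # [nil]
```

Note that the result is always an array, mirroring the fact that indexing an R vector returns a vector even when a single element is selected.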
+It is also possible to index a vector by range:
+```{ruby range}
+puts vec4[(2..5)]
+```
+Elements in a vector can be named using the 'names' attribute of a vector:
+```{ruby naming}
+full_name = R.c("Rodrigo", "A", "Botafogo")
+full_name.names = R.c("First", "Middle", "Last")
+puts full_name
+```
+Or it can also be named by using the 'c' function with named parameters:
+```{ruby named_param}
+full_name = R.c(First: "Rodrigo", Middle: "A", Last: "Botafogo")
+puts full_name
+```
+### Extracting Native Ruby Types from a Vector
+Vectors created with 'R.c' are of class R::Vector.  You might have noticed that when indexing a
+vector, a new vector is returned, even if this vector has one single element. In order to use
+R::Vector with other ruby classes it might be necessary to extract the actual Ruby native type
+from the vector. In order to do this extraction the '>>' operator is used.
+```{ruby ruby_native}
+puts vec4
+puts vec4 >> 0
+puts vec4 >> 4
+```
+Note that indexing with '>>' starts at 0 and not at 1.  Also, negative indexing is not possible with '>>'.
+## Matrix
+A matrix is a collection of elements organized as a two dimensional table.  A matrix can be
+created by the 'matrix' function:
+```{ruby matrix}
+mat = R.matrix(R.c(1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0),
+               nrow: 3,
+               ncol: 3)
+puts mat
+```
+Note that matrix data is organized by column first. It is possible to organize the matrix
+memory by row first by passing an extra argument to the 'matrix' function:
+```{ruby matrix_rowfirst}
+mat_row = R.matrix(R.c(1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0),
+                   nrow: 3,
+                   ncol: 3,
+                   byrow: true)
+puts mat_row
+```
+### Indexing a Matrix
+A matrix can be indexed by [row, column]:
+```{ruby matrix_index}
+puts mat_row[1, 1]
+puts mat_row[2, 3]
+```
+It is possible to index an entire row or column with the ':all' keyword:
+```{ruby matrix_index_all}
+puts mat_row[1, :all]
+puts mat_row[:all, 2]
+```
+Indexing with a vector is also possible for matrices. In the following example we want
+rows 1 and 3 and columns 2 and 3 building a 2 x 2 matrix.
+```{ruby matrix_index_vector}
+puts mat_row[R.c(1, 3), R.c(2, 3)]
+```
+Matrices can be combined with functions 'rbind':
+```{ruby matrix_combine_rbind}
+puts mat_row.rbind(mat)
+```
+and 'cbind':
+```{ruby matrix_combine_cbind}
+puts mat_row.cbind(mat)
+```
+## List
+A list is a data structure that can contain sublists of different types, while vector and matrix
+can only hold one type of element.
+```{ruby list}
+nums = R.c(1.0, 2.0, 3.0)
+strs = R.c("a", "b", "c", "d")
+bool = R.c(true, true, false)
+lst = R.list(nums: nums, strs: strs, bool: bool)
+puts lst
+```
+Note that 'lst' elements are named elements.
+### List Indexing
+List indexing, also called slicing, is done using the '[]' operator and the '[[]]' operator. Let's
+first start with the '[]' operator. The list above has three sublists; indexing with '[]' will
+return one of the sublists.
+```{ruby list_indexing}
+puts lst[1]
+```
+Note that when using '[]' a new list is returned.  When using the double square bracket operator
+the value returned is the actual element of the list in the given position and not a slice of
+the original list:
+```{ruby list_indexing_single}
+puts lst[[1]]
+```
+When elements are named, as done with lst, indexing can be done by name:
+```{ruby list_indexing_by_name}
+puts lst[['bool']][[1]] >> 0
+```
+In this example, first the 'bool' element of the list was extracted, not as a list, but as a vector,
+then the first element of the vector was extracted (note that vectors also accept the '[[]]'
+operator) and then the vector was indexed by its first element, extracting the native Ruby type.
+## Data Frame
+A data frame is a table like structure in which each column has the same number of
+rows. Data frames are the basic structure for storing data for data analysis.  We have already
+seen a data frame previously when we accessed variable '~:mtcars'.  In order to create a
+data frame, function 'data__frame' is used:
+```{ruby dataframe}
+df = R.data__frame(
+  year: R.c(2010, 2011, 2012),
+  income: R.c(1000.0, 1500.0, 2000.0))
+puts df
+```
+### Data Frame Indexing
+A data frame can be indexed the same way as a matrix, by using '[row, column]', where row and
+column can either be a number or the name of the row or column:
+```{ruby dataframe_index}
+puts (~:mtcars).head
+puts (~:mtcars)[1, 2]
+puts (~:mtcars)['Datsun 710', 'mpg']
+```
+Extracting a column from a data frame as a vector can be done by using the double square bracket
+operator:
+```{ruby dataframe_column}
+puts (~:mtcars)[['mpg']]
+```
+A data frame column can also be accessed as if it were an instance variable of the data frame:
+```{ruby dataframe_instance_variable}
+puts (~:mtcars).mpg
+```
+Slicing a data frame can be done by indexing it with a vector (we use 'head' to reduce the
+output):
+```{ruby dataframe_column_slice}
+puts (~:mtcars)[R.c('mpg', 'hp')].head
+```
+A row slice can be obtained by indexing by row and using the ':all' keyword for the column:
+```{ruby dataframe_row_slice}
+puts (~:mtcars)[R.c('Datsun 710', 'Camaro Z28'), :all]
+```
+Finally, a data frame can also be indexed with a logical vector.  In this next example, the
+'am' column of :mtcars is compared with 0 (with method 'eq').  When 'am' is equal to 0 the
+car is automatic.  So, by doing '(~:mtcars).am.eq 0' a logical vector is created with
+'true' whenever 'am' is 0 and 'false' otherwise.
+```{ruby logical_vector_filter}
+# obtain a vector with 'true' for cars with automatic transmission
+automatic = (~:mtcars).am.eq 0
+puts automatic
+```
+Using this logical vector, the data frame is indexed, returning a new data frame in
+which all cars have automatic transmission.
+```{ruby dataframe_logical}
+# slice the data frame by using this vector
+puts (~:mtcars)[automatic, :all]
+```
+# Writing Expressions in Galaaz
+Galaaz extends Ruby to work with complex expressions, similar to R's expressions built with 'quote'
+(base R) or 'quo' (tidyverse).  Let's take a look at some of those expressions.
+## Expressions from operators
+The code below
+creates an expression summing two symbols:
+```{ruby expressions}
+exp1 = :a + :b
+puts exp1
+```
+We can build any complex mathematical expression
+```{ruby expr2}
+exp2 = (:a + :b) * 2.0 + :c ** 2 / :z
+puts exp2
+```
+It is also possible to use inequality operators in building expressions
+```{ruby expr3}
+exp3 = (:a + :b) >= :z
+puts exp3
+```
+Galaaz provides both symbolic representations for operators, such as (>, <, !=), and functional
+notation for those operators, such as (.gt, .ge, etc.).  So the same expression written
+above can also be written as
+```{ruby expr4}
+exp4 = (:a + :b).ge :z
+puts exp4
+```
+Two types of expressions can only be created with the functional representation of the operators:
+those involving '==' and '='.  In order to write an expression involving '==' we
+need to use the method '.eq' and for '=' we need the function '.assign'
+```{ruby expr5}
+exp5 = (:a + :b).eq :z
+puts exp5
+```
+```{ruby expr6}
+exp6 = :y.assign :a + :b
+puts exp6
+```
+In general we think that using the functional notation is preferable to using the
+symbolic notation as otherwise, we end up writing invalid expressions such as
+```{ruby exp_wrong, warning=FALSE, eval=FALSE}
+exp_wrong = (:a + :b) == :z
+puts exp_wrong
+```
+and it might be difficult to understand what is going on here.  The problem lies with the fact that
+when using '==' we are comparing expression (:a + :b) to expression :z with '=='.  When the
+comparison is executed, the system tries to evaluate :a, :b and :z, and those symbols at
+this time are not bound to anything and we get an "object 'a' not found" message.
+If we only use functional notation, this type of error will not occur.
+## Expressions with R methods
+It is often necessary to create an expression that uses a method or function.  For instance, in
+mathematics, it's quite natural to write an expression such as $y = sin(x)$. In this case, the
+'sin' function is part of the expression and should not be immediately executed. Now, let's say
+that 'x' is an angle of 45$^\circ$ and we actually want our expression to be $y = 0.707...$.
+When we want the function to be part of the expression, we call the function preceding it
+by the letter E, such as 'E.sin(x)'
+```{ruby method_expression}
+exp7 = :y.assign E.sin(:x)
+puts exp7
+```
+Expressions can also be written using '.' notation:
+```{ruby expression_with_dot}
+exp8 = :y.assign :x.sin
+puts exp8
+```
+When a function has multiple arguments, the first one can be used before the '.':
+```{ruby expression_multiple_args}
+exp9 = :x.c(:y)
+puts exp9
+```
+## Evaluating an Expression
+Expressions can be evaluated by calling function 'eval' with a binding. A binding can be provided
+with a list:
+```{ruby eval_expression_list}
+exp = (:a + :b) * 2.0 + :c ** 2 / :z
+puts exp.eval(R.list(a: 10, b: 20, c: 30, z: 40))
+```
+... with a data frame:
+```{ruby eval_expression_df}
+df = R.data__frame(
+  a: R.c(1, 2, 3),
+  b: R.c(10, 20, 30),
+  c: R.c(100, 200, 300),
+  z: R.c(1000, 2000, 3000))
+puts exp.eval(df)
+```
+# Manipulating Data
+One of the major benefits of Galaaz is to bring strong data manipulation to Ruby. The following
+examples were extracted from Hadley Wickham's "R for Data Science" (https://r4ds.had.co.nz/). This
+is a highly recommended book for those not already familiar with the 'tidyverse' style of
+programming in R. In the sections to follow, we will limit ourselves to converting the R code to
+Galaaz.
+For these
+examples, we will investigate the nycflights13 data set available in the package of the
+same name.  We use function 'R.install\_and\_loads' that checks if the library is available
+locally, and if not, installs it. This data frame contains all 336,776 flights that
+departed from New York City in 2013. The data comes from the US Bureau of
+Transportation Statistics.
+Dplyr uses 'tibbles' in place of data frames; unfortunately, tibbles do not yet print properly in
+Galaaz due to a bug in fastR.  In order to print a tibble we need to convert it to a data frame
+using the 'as\_\_data\_\_frame' method.
+```{ruby nycflights13}
+R.install_and_loads('nycflights13')
+R.library('dplyr')
+```
+```{ruby flights}
+flights = ~:flights
+puts flights.head
+```
+## Filtering rows with Filter
+In this example we filter the flights data set by giving the filter function two expressions:
+the first, ':month.eq 1', and the second, ':day.eq 1'
+```{ruby filter_rows}
+puts flights.filter((:month.eq 1), (:day.eq 1)).head
+```
+## Logical Operators
+All flights that departed in November or December:
+```{ruby nov_dec}
+puts flights.filter((:month.eq 11) | (:month.eq 12)).head
+```
+The same as above, but using the 'in' operator. In R, it is possible to define many operators
+by doing %<op>%. The %in% operator checks if a value is in a vector.  In order to use those
+operators from Galaaz the '._' method is used, where the first argument is the operator's
+symbol, in this case ':in' and the second argument is the vector:
+```{ruby in_op}
+puts flights.filter(:month._ :in, R.c(11, 12)).head
+```
+## Filtering with NA (Not Available)
+Let's first create a 'tibble' with a Not Available value (R::NA).  Tibbles are a modern
+version of a data frame and operate very similarly to one.  They differ in how they output
+the values and in the result of some subsetting operations, which are more consistent than
+what is obtained from a data frame.
+```{ruby na_tibble}
+df = R.tibble(x: R.c(1, R::NA, 3))
+puts df
+```
+Now filtering by :x > 1 shows all rows that satisfy this condition; the row with R::NA does
+not.
+```{ruby filter_na}
+puts df.filter(:x > 1)
+```
+To match an NA use method 'is__na'
+```{ruby with_na}
+puts df.filter((:x.is__na) | (:x > 1))
+```
+## Arrange Rows with arrange
+Arrange reorders the rows of a data frame by the given arguments.
+```{ruby arrange}
+puts flights.arrange(:year, :month, :day).head
+```
+To arrange in descending order, use function 'desc'
+```{ruby desc_arrange}
+puts flights.arrange(:dep_delay.desc).head
+```
+## Selecting columns
+To select specific columns from a dataset we use function 'select':
+```{ruby select}
+puts flights.select(:year, :month, :day).head
+```
+It is also possible to select columns in a given range
+```{ruby select_range}
+puts flights.select(:year.up_to :day).head
+```
+Select all columns that start with a given name sequence
+```{ruby select_starts_with}
+puts flights.select(E.starts_with('arr')).head
+```
+Other functions that can be used:
+* ends_with("xyz"): matches names that end with “xyz”.
+* contains("ijk"): matches names that contain “ijk”.
+* matches("(.)\\1"): selects variables that match a regular expression. This one matches
+  any variables that contain repeated characters.
+* num_range("x", (1..3)): matches x1, x2 and x3
+A helper function that comes in handy when we just want to rearrange column order is 'everything':
+```{ruby everything}
+puts flights.select(:year, :month, :day, E.everything).head
+```
+## Add variables to a dataframe with 'mutate'
+```{ruby small_flights}
+flights_sm = flights.
+               select((:year.up_to :day),
+                      E.ends_with('delay'),
+                      :distance,
+                      :air_time)
+puts flights_sm.head
+```
+```{ruby mutate}
+flights_sm = flights_sm.
+               mutate(gain: :dep_delay - :arr_delay,
+                      speed: :distance / :air_time * 60)
+puts flights_sm.head
+```
+## Summarising data
+Function 'summarise' calculates summaries for the data frame. When no 'group_by' is used
+a single value is obtained from the data frame:
+```{ruby summarise}
+puts flights.summarise(delay: E.mean(:dep_delay, na__rm: true))
+```
+When a data frame is grouped with 'group_by' summaries apply to the given group:
+```{ruby summarise_group_by}
+by_day = flights.group_by(:year, :month, :day)
+puts by_day.summarise(delay: :dep_delay.mean(na__rm: true)).head
+```
+Next we put many operations together by piping them one after the other:
+```{ruby pipping}
+delays = flights.
+           group_by(:dest).
+           summarise(
+             count: E.n,
+             dist: :distance.mean(na__rm: true),
+             delay: :arr_delay.mean(na__rm: true)).
+           filter(:count > 20, :dest != "HNL")
+puts delays.head
+```
+# Using Data Table
+```{ruby fread}
+R.library('data.table')
+R.install_and_loads('curl')
+input = "https://raw.githubusercontent.com/Rdatatable/data.table/master/vignettes/flights14.csv"
+flights = R.fread(input)
+puts flights
+puts flights.dim
+```
+```{ruby data_table}
+data_table = R.data__table(
+  ID: R.c("b","b","b","a","a","c"),
+  a: (1..6),
+  b: (7..12),
+  c: (13..18)
+)
+puts data_table
+puts data_table.ID
+```
+```{ruby subset_i}
+# subset rows in i
+ans = flights[(:origin.eq "JFK") & (:month.eq 6)]
+puts ans.head
+# Get the first two rows from flights.
+ans = flights[(1..2)]
+puts ans
+# Sort flights first by column origin in ascending order, and then by dest in descending order:
+# ans = flights[E.order(:origin, -(:dest))]
+# puts ans.head
+```
+```{ruby select_j}
+# Select column(s) in j
+# select arr_delay column, but return it as a vector.
+ans = flights[:all, :arr_delay]
+puts ans.head
+# Select arr_delay column, but return as a data.table instead.
+ans = flights[:all, :arr_delay.list]
+puts ans.head
+ans = flights[:all, E.list(:arr_delay, :dep_delay)]
+```
+# Graphics in Galaaz
+Creating graphics in Galaaz is quite easy, as it can use all the power of ggplot2.  There are
+many resources on the web that teach ggplot, so here we give a quick example of ggplot
+integration with Ruby.  We continue to use the :mtcars dataset and we will plot a diverging
+bar plot, showing cars that have 'above' or 'below' average gas consumption. Let's first prepare
+the data frame with the necessary data:
+```{ruby diverging_plot_pre}
+# copy the R variable :mtcars to the Ruby mtcars variable
+mtcars = ~:mtcars
+# create a new column 'car_name' to store the car names so that it can be
+# used for plotting. The 'rownames' of the data frame cannot be used as
+# data for plotting
+mtcars.car_name = R.rownames(:mtcars)
+# compute normalized mpg and add it to a new column called mpg_z
+# Note that the mean value for mpg can be obtained by calling the 'mean'
+# function on the vector 'mtcars.mpg'.  The same with the standard
+# deviation 'sd'.  The vector is then rounded to two digits with 'round 2'
+mtcars.mpg_z = ((mtcars.mpg - mtcars.mpg.mean)/mtcars.mpg.sd).round 2
+# create a new column 'mpg_type'. Function 'ifelse' is a vectorized function
+# that looks at every element of the mpg_z vector and if the value is below
+# 0, returns 'below', otherwise returns 'above'
+mtcars.mpg_type = (mtcars.mpg_z < 0).ifelse("below", "above")
+# order the mtcars data set by the mpg_z vector from smaller to larger values
+mtcars = mtcars[mtcars.mpg_z.order, :all]
+# convert the car_name column to a factor to retain sorted order in plot
+mtcars.car_name = mtcars.car_name.factor levels: mtcars.car_name
+# let's look at the final data frame
+puts mtcars.head
+```
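The normalisation and labelling steps in the chunk above can also be followed in plain Ruby. This is only a sketch with three invented values, not the Galaaz code path; it assumes the sample standard deviation (dividing by n - 1), which is what R's 'sd' computes.

```ruby
# Plain-Ruby version of the normalisation step (invented sample values).
mpg = [10.0, 20.0, 30.0]

mean = mpg.sum / mpg.size
# R's sd() divides by n - 1, so the sketch does the same.
variance = mpg.sum { |v| (v - mean)**2 } / (mpg.size - 1)
sd = Math.sqrt(variance)

# z-score rounded to two digits, then labelled like 'ifelse' does.
mpg_z = mpg.map { |v| ((v - mean) / sd).round(2) }
mpg_type = mpg_z.map { |z| z < 0 ? "below" : "above" }

p mpg_z     # prints [-1.0, 0.0, 1.0]
p mpg_type  # prints ["below", "above", "above"]
```

Note that a value exactly at the mean gets a z-score of 0 and is therefore labelled 'above', since the test is strictly 'below 0'.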
+Now, let's plot the diverging bar plot.  When using gKnit, there is no need to call
+'R.awt' to create a plotting device, since gKnit takes care of it. Galaaz
+provides integration with ggplot. The interested reader should check online for more
+information on ggplot, since describing how ggplot works is outside the scope of this
+manual. We give here only a brief description of how this plot is generated.
+ggplot implements the 'grammar of graphics'. In this approach, plots are built by
+adding layers to the plot.  On the first layer we describe what we want on the 'x'
+and 'y' axes of the plot.  In this case, we have 'car_name' on the 'x' axis and
+'mpg\_z' on the 'y' axis. Then the type of graph is specified by adding
+'geom\_bar' (for a bar graph).  We specify that our bars should be filled using
+'mpg\_type', which is either 'above' or 'below', giving two colours for
+filling. On the next layer we specify the labels for the graph, then we add the
+title and subtitle.  Finally, bars in a bar chart usually run vertically,
+but in this graph we want them laid out horizontally, so we add 'coord\_flip'.
+```{ruby diverging_bar, fig.width = 9.1, fig.height = 6.5}
+require 'ggplot'
+puts mtcars.ggplot(E.aes(x: :car_name, y: :mpg_z, label: :mpg_z)) +
+     R.geom_bar(E.aes(fill: :mpg_type), stat: 'identity', width: 0.5) +
+     R.scale_fill_manual(name: 'Mileage',
+                         labels: R.c('Above Average', 'Below Average'),
+                         values: R.c('above': '#00ba38', 'below': '#f8766d')) +
+     R.labs(subtitle: "Normalised mileage from 'mtcars'",
+            title: "Diverging Bars") +
+     R.coord_flip
+```
+# Coding with Tidyverse
+In R, when coding with 'tidyverse', arguments to a function are usually not
+*referentially transparent*. That is, you can’t replace a value with a seemingly equivalent
+object that you’ve defined elsewhere. To see the problem, let's first define a data frame:
+```{ruby df}
+df = R.data__frame(x: (1..3), y: (3..1))
+puts df
+```
+and now, let's look at this code:
+```{r not_transp, eval=FALSE}
+my_var <- x
+filter(df, my_var == 1)
+```
+It generates the following error: "object 'x' not found".
+In Galaaz, however, arguments are referentially transparent, as can be seen in the
+code below.  Note first that 'my_var = :x' will not give the error "object 'x' not found",
+since ':x' is treated as an expression and assigned to my\_var. Then, in (my\_var.eq 1),
+my\_var is a variable that resolves to ':x', so the call becomes equivalent to (:x.eq 1), which is
+what we want.
+```{ruby my_var}
+my_var = :x
+puts df.filter(my_var.eq 1)
+```
+As stated by Hadley Wickham:
+> dplyr code is ambiguous. Depending on what variables are defined where,
+> filter(df, x == y) could be equivalent to any of:
+```
+df[df$x == df$y, ]
+df[df$x == y, ]
+df[x == df$y, ]
+df[x == y, ]
+```
+In Galaaz this ambiguity does not exist: filter(df, x.eq y) is not a valid expression, as
+expressions are built with symbols.  With filter(df, :x.eq y) we are looking for elements
+of the 'x' column that are equal to a previously defined y variable.  Finally, with
+filter(df, :x.eq :y) we are looking for elements in which the 'x' column value is equal to
+the 'y' column value. This can be seen in the following two chunks of code:
+```{ruby disamb1}
+y = 1
+x = 2
+# looking for values where the 'x' column is equal to the 'y' column
+puts df.filter(:x.eq :y)
+```
+```{ruby disamb2}
+# looking for values where the 'x' column is equal to the 'y' variable
+# in this case, the number 1
+puts df.filter(:x.eq y)
+```
+## Writing a function that applies to different data sets
+Let's suppose that we want to write a function that receives a data frame as its argument
+and adds to it a column 'y' that is equal to the sum of the elements in column 'a'
+and column 'x'.
+Here is the intended behaviour using the 'mutate' function of 'dplyr':
+```
+mutate(df1, y = a + x)
+mutate(df2, y = a + x)
+mutate(df3, y = a + x)
+mutate(df4, y = a + x)
+```
+The naive approach to writing an R function to solve this problem is:
+```
+mutate_y <- function(df) {
+  mutate(df, y = a + x)
+}
+```
+Unfortunately, in R, this function can fail silently if one of the variables isn’t present
+in the data frame, but is present in the global environment.  We will not go through here how
+to solve this problem in R.
+In Galaaz the method mutate_y below will work fine and will never fail silently.
+```{ruby mutate_y, warning=FALSE}
+def mutate_y(df)
+  df.mutate(:y.assign :a + :x)
+end
+```
+Here we create a data frame that has only one column named 'x':
+```{ruby data_frame_no_a_column, warning=FALSE}
+df1 = R.data__frame(x: (1..3))
+puts df1
+```
+Note that method mutate_y fails for this data frame, which has no 'a' column, even though a
+variable 'a' is defined and in scope where the method is called.  Variable 'a' has no
+relationship with the symbol ':a' used in the definition of 'mutate\_y' above:
+```{ruby call_mutate_y, warning = FALSE}
+a = 10
+mutate_y(df1)
+```
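The same "fail loudly" behaviour can be sketched in plain Ruby without Galaaz. Here a Hash stands in for the data frame and `add_y` is a hypothetical helper written only for this illustration; the point is that a local variable named `a` never substitutes for a missing `:a` column.

```ruby
# Toy "data frame": a Hash from column name (Symbol) to an Array of values.
# Hypothetical helper that adds column :y as the element-wise sum of :a and :x.
def add_y(columns)
  col_a = columns.fetch(:a)   # raises KeyError when there is no :a column
  col_x = columns.fetch(:x)
  columns.merge(y: col_a.zip(col_x).map { |ai, xi| ai + xi })
end

a = 10                        # a local variable, unrelated to the symbol :a
df1 = { x: [1, 2, 3] }

begin
  add_y(df1)
  outcome = "no error"
rescue KeyError => e
  outcome = e.message
end
puts outcome

# With both columns present the helper works as intended.
df2 = { a: [1, 2, 3], x: [10, 20, 30] }
p add_y(df2)[:y]              # prints [11, 22, 33]
```

The first call raises instead of silently picking up the value 10, because a method body cannot see the caller's local variables and `:a` is just a name to look up in the table.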
+## Different expressions
+Let's move to the next problem presented by Hadley: writing a function in R that receives
+two arguments, the first a variable and the second an expression, is not trivial.
+Below we create a data frame and we want to write a function that groups data by a variable and
+summarises it by an expression:
+```{r diff_expr}
+set.seed(123)
+df <- data.frame(
+  g1 = c(1, 1, 2, 2, 2),
+  g2 = c(1, 2, 1, 2, 1),
+  a = sample(5),
+  b = sample(5)
+)
+as.data.frame(df)
+d2 <- df %>%
+  group_by(g1) %>%
+  summarise(a = mean(a))
+as.data.frame(d2)
+d2 <- df %>%
+  group_by(g2) %>%
+  summarise(a = mean(a))
+as.data.frame(d2)
+```
+As shown by Hadley, one might expect this function to do the trick:
+```{r diff_exp_fnc}
+my_summarise <- function(df, group_var) {
+  df %>%
+    group_by(group_var) %>%
+    summarise(a = mean(a))
+}
+# my_summarise(df, g1)
+#> Error: Column `group_var` is unknown
+```
+In order to solve this problem, coding with dplyr requires the introduction of many new concepts
+and functions such as 'quo', 'quos', 'enquo', 'enquos', '!!' (bang bang), '!!!' (triple bang).
+Again, we'll leave to Hadley the explanation of how to use all those functions.
+Now, let's try to implement the same function in Galaaz.  The next code block first prints the
+'df' data frame defined previously in R (to access an R variable from Galaaz, we use the tilde
+operator '~' applied to the R variable name as a symbol, i.e., ':df'):
+```{ruby r_dataframe}
+puts ~:df
+```
+We then create the 'my_summarize' method and call it passing the R data frame and
+the group by variable ':g1':
+```{ruby diff_exp_ruby_func}
+def my_summarize(df, group_var)
+  df.group_by(group_var).
+    summarize(a: :a.mean)
+end
+puts my_summarize(:df, :g1)
+```
+It works!!! Well, let's make sure this was not just some coincidence:
+```{ruby group_g2}
+puts my_summarize(:df, :g2)
+```
+Great, everything is fine! No magic, no new functions, no complexities, just normal, standard Ruby
+code.  If you've ever done NSE in R, this certainly feels much safer and easier to implement.
+## Different input variables
+In the previous section we managed to get rid of all NSE formulation for a simple example, but
+does this remain true for more complex examples, or will the Galaaz way prove impractical for
+more complex code?
+In the next example Hadley asks us to write a function that, given an expression such as 'a'
+or 'a * b', calculates three summaries.  We want a function that does the same as these R
+statements:
+```
+summarise(df, mean = mean(a), sum = sum(a), n = n())
+#> # A tibble: 1 x 3
+#>    mean   sum     n
+#>   <dbl> <int> <int>
+#> 1     3    15     5
+summarise(df, mean = mean(a * b), sum = sum(a * b), n = n())
+#> # A tibble: 1 x 3
+#>    mean   sum     n
+#>   <dbl> <int> <int>
+#> 1   9    45     5
+```
+Let's try it in Galaaz:
+```{ruby summarize_method}
+def my_summarise2(df, expr)
+  df.summarize(
+    mean: E.mean(expr),
+    sum: E.sum(expr),
+    n: E.n
+  )
+end
+puts my_summarise2((~:df), :a)
+puts "\n"
+puts my_summarise2((~:df), :a * :b)
+```
+Once again, there is no need to use any special theory or functions.  The only point to be
+careful about is the use of 'E' to build expressions from functions 'mean', 'sum' and 'n'.
+## Different input and output variable
+Now the next challenge presented by Hadley is to vary the name of the output variables based on
+the received expression.  So, if the input expression is 'a', we want our data frame columns to
+be named 'mean\_a' and 'sum\_a'.  Now, if the input expression is 'b', columns
+should be named 'mean\_b' and 'sum\_b'.
+```
+mutate(df, mean_a = mean(a), sum_a = sum(a))
+#> # A tibble: 5 x 6
+#>      g1    g2     a     b mean_a sum_a
+#>   <dbl> <dbl> <int> <int>  <dbl> <int>
+#> 1     1     1     1     3      3    15
+#> 2     1     2     4     2      3    15
+#> 3     2     1     2     1      3    15
+#> 4     2     2     5     4      3    15
+#> # … with 1 more row
+mutate(df, mean_b = mean(b), sum_b = sum(b))
+#> # A tibble: 5 x 6
+#>      g1    g2     a     b mean_b sum_b
+#>   <dbl> <dbl> <int> <int>  <dbl> <int>
+#> 1     1     1     1     3      3    15
+#> 2     1     2     4     2      3    15
+#> 3     2     1     2     1      3    15
+#> 4     2     2     5     4      3    15
+#> # … with 1 more row
+```
+In order to solve this problem in R, Hadley needs to introduce some more new functions and notations:
+'quo_name' and the ':=' operator from package 'rlang'.
+Here is our Ruby code:
+```{ruby name_change}
+def my_mutate(df, expr)
+  mean_name = "mean_#{expr.to_s}"
+  sum_name = "sum_#{expr.to_s}"
+  df.mutate(mean_name => E.mean(expr),
+            sum_name => E.sum(expr))
+end
+puts my_mutate((~:df), :a)
+puts "\n"
+puts my_mutate((~:df), :b)
+```
+It really seems that "Non Standard Evaluation" is actually quite standard in Galaaz! But, you
+might have noticed a small change in the way the arguments to the mutate method were passed.
+In a previous example we used df.summarise(mean: E.mean(:a), ...) where the column name was
+followed by a ':' colon.  In this example, we have df.mutate(mean_name => E.mean(expr), ...)
+and variable mean\_name is not followed by ':' but by '=>'.  This is standard Ruby notation.
+The 'name:' form always creates the literal symbol key ':name', so writing 'mean\_name:' would
+name the column 'mean\_name'.  With '=>', Ruby first evaluates the variable mean\_name and uses
+its value (for instance 'mean\_a') as the key.
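The difference between the two notations can be checked in plain Ruby, independently of Galaaz. In this small sketch the variable names are the ones used in 'my_mutate' above, and the value `1` is just a placeholder for the expression that would be passed.

```ruby
expr = :a
mean_name = "mean_#{expr}"

# 'name:' creates a literal Symbol key; the variable mean_name is not consulted.
literal_key = { mean_name: 1 }

# '=>' evaluates the left-hand side first; the key is the variable's value.
computed_key = { mean_name => 1 }

p literal_key.keys    # prints [:mean_name]
p computed_key.keys   # prints ["mean_a"]
```

This is why the method has to use '=>' whenever the column name is computed at run time.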
+## Capturing multiple variables
+Moving on with new complexities, Hadley asks us to solve the problem in which the
+summarise function will receive any number of grouping variables.
+This again is quite standard Ruby.  In order to receive an undefined number of parameters,
+the parameter name is preceded by '*':
+```{ruby multiple_vars}
+def my_summarise3(df, *group_vars)
+  df.group_by(*group_vars).
+    summarise(a: E.mean(:a))
+end
+puts my_summarise3((~:df), :g1, :g2)
+```
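For readers less familiar with Ruby's '*' (splat) parameter, here is a minimal plain-Ruby sketch of the same mechanism. The `describe_grouping` helper is hypothetical and only builds a string; it does not call R or Galaaz.

```ruby
# Hypothetical helper: collects any number of grouping variables into an Array.
def describe_grouping(table, *group_vars)
  "#{table} grouped by #{group_vars.join(', ')}"
end

puts describe_grouping(:df, :g1)        # prints "df grouped by g1"
puts describe_grouping(:df, :g1, :g2)   # prints "df grouped by g1, g2"

# An existing Array is expanded back into separate arguments with '*',
# which is what 'group_by(*group_vars)' does in my_summarise3.
vars = [:g1, :g2]
puts describe_grouping(:df, *vars)      # prints "df grouped by g1, g2"
```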
+## Why does R require NSE and Galaaz does not?
+NSE introduces a number of new concepts, such as 'quoting', 'quasiquotation', 'unquoting' and
+'unquote-splicing', while in Galaaz none of those concepts are needed. What gives?
+R is an extremely flexible language and it has lazy evaluation of parameters. When in R a
+function is called as 'summarise(df, a = b)', the summarise function receives the literal
+'a = b' parameter and can work with this as if it were a string. In R, it is not clear what
+a and b are: they can be expressions or they can be variables, and it is up to the function to
+decide what 'a = b' means.
+In Ruby, there is no lazy evaluation of parameters and 'a' is always a variable and so is 'b'.
+Variables assume their value as soon as they are used, so 'x = a' is immediately evaluated and
+variable 'x' will receive the value of variable 'a' as soon as the Ruby statement is executed.
+Ruby also provides the notion of a symbol; ':a' is a symbol and does not evaluate to anything.
+Galaaz uses Ruby symbols to build expressions that are not bound to anything: ':a.eq :b' is
+clearly an expression and has no relationship whatsoever with the statement 'a = b'. By using
+symbols, variables and expressions, all the possible ambiguities that are found in R are
+eliminated in Galaaz.
+The main problem that remains is that R functions do not clearly document what type
+of input they expect: they might expect regular variables or they might expect
+expressions, and the R function will know how to deal with an input of the form
+'a = b'. For the Ruby developer it might not be immediately clear whether to call the
+function passing the value 'true' if variable 'a' is equal to variable 'b', or to
+call the function passing the expression ':a.eq :b'.
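The distinction between a value computed immediately from variables and an expression built from symbols can be illustrated with a toy in plain Ruby. The `Expr` struct and the `eq` helper below are assumptions made up for this sketch; they are not Galaaz's internals, which attach 'eq' to symbols directly.

```ruby
# Toy expression object; a stand-in for illustration, not Galaaz's internals.
Expr = Struct.new(:op, :lhs, :rhs) do
  def to_s
    "#{lhs} #{op} #{rhs}"
  end
end

# Hypothetical helper that records a comparison instead of performing it.
def eq(lhs, rhs)
  Expr.new("==", lhs, rhs)
end

a = 1
b = 2

value = (a == b)       # variables: evaluated right away, gives false
columns = eq(:a, :b)   # symbols: column 'a' compared with column 'b'
mixed = eq(:a, b)      # column 'a' compared with the current value of b

p value        # prints false
puts columns   # prints a == b
puts mixed     # prints a == 2
```

In the `mixed` case the variable `b` has already been replaced by its value when the expression is built, which mirrors the difference between (:x.eq :y) and (:x.eq y) discussed earlier.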
+## Advanced dplyr features
+In the blog post "Programming with dplyr by using dplyr" (https://www.r-bloggers.com/programming-with-dplyr-by-using-dplyr/), Iñaki Úcar shows surprise that some R users are trying to code in dplyr while avoiding
+the use of NSE.  For instance, he says:
+> Take the example of seplyr. It stands for standard evaluation dplyr, and enables us to
+> program over dplyr without having “to bring in (or study) any deep-theory or
+> heavy-weight tools such as rlang/tidyeval”.
+For me, there isn't really any surprise that users are trying to avoid dplyr deep-theory. R
+users frequently are not programmers and learning to code is already hard business; on top
+of that, having to learn how to 'quote' or 'enquo' or 'quos' or 'enquos' is not necessarily
+a 'piece of cake'. So much so that 'tidyeval' has some more advanced functions that,
+instead of quoted expressions, take strings as arguments.
+In the following examples, we show the use of functions 'group\_by\_at', 'summarise\_at' and
+'rename\_at' that receive strings as arguments. The data frame used is 'starwars', which describes
+features of characters in the Star Wars movies:
+```{ruby starwars}
+puts (~:starwars).head
+```
+The grouped_mean function below receives grouping variables and calculates summaries for
+the value\_variables given:
+```{r grouped_mean}
+grouped_mean <- function(data, grouping_variables, value_variables) {
+  data %>%
+    group_by_at(grouping_variables) %>%
+    mutate(count = n()) %>%
+    summarise_at(c(value_variables, "count"), mean, na.rm = TRUE) %>%
+    rename_at(value_variables, funs(paste0("mean_", .)))
+    }
+gm = starwars %>%
+   grouped_mean("eye_color", c("mass", "birth_year"))
+as.data.frame(gm)
+```
+The same code with Galaaz becomes:
+```{ruby advanced_starwars}
+def grouped_mean(data, grouping_variables, value_variables)
+  data.
+    group_by_at(grouping_variables).
+    mutate(count: E.n).
+    summarise_at(E.c(value_variables, "count"), ~:mean, na__rm: true).
+    rename_at(value_variables, E.funs(E.paste0("mean_", value_variables)))
+end
+puts grouped_mean((~:starwars), "eye_color", E.c("mass", "birth_year"))
+```
+[TO BE CONTINUED...]
+# Contributing
 * Fork it
 * Create your feature branch (git checkout -b my-new-feature)
@@ -210,3 +2055,4 @@ puts gg
 * Push to the branch (git push origin my-new-feature)
 * Create new Pull Request
+# References