galaaz 0.4.6 → 0.5.0
- checksums.yaml +5 -5
- data/README.md +3575 -118
- data/Rakefile +21 -4
- data/bin/gknit +152 -6
- data/bin/gknit-draft +105 -0
- data/bin/gknit-draft.rb +28 -0
- data/bin/gknit_Rscript +127 -0
- data/bin/grun +27 -1
- data/bin/gstudio +47 -4
- data/bin/{gstudio.rb → gstudio_irb.rb} +0 -0
- data/bin/gstudio_pry.rb +7 -0
- data/blogs/galaaz_ggplot/galaaz_ggplot.Rmd +3 -12
- data/blogs/galaaz_ggplot/galaaz_ggplot.html +77 -222
- data/blogs/galaaz_ggplot/galaaz_ggplot.md +4 -31
- data/blogs/galaaz_ggplot/galaaz_ggplot.pdf +0 -0
- data/blogs/galaaz_ggplot/galaaz_ggplot_files/figure-html/midwest_rb.png +0 -0
- data/blogs/galaaz_ggplot/galaaz_ggplot_files/figure-html/scatter_plot_rb.png +0 -0
- data/blogs/galaaz_ggplot/midwest.Rmd +1 -9
- data/blogs/gknit/gknit.Rmd +232 -123
- data/blogs/{dev/dev.html → gknit/gknit.html} +1897 -33
- data/blogs/gknit/gknit.pdf +0 -0
- data/blogs/gknit/lst.rds +0 -0
- data/blogs/gknit/stats.bib +27 -0
- data/blogs/manual/lst.rds +0 -0
- data/blogs/manual/manual.Rmd +1893 -47
- data/blogs/manual/manual.html +3153 -347
- data/blogs/manual/manual.md +3575 -118
- data/blogs/manual/manual.pdf +0 -0
- data/blogs/manual/manual.tex +4026 -0
- data/blogs/manual/manual_files/figure-html/bubble-1.png +0 -0
- data/blogs/manual/manual_files/figure-html/diverging_bar.png +0 -0
- data/blogs/manual/manual_files/figure-latex/bubble-1.png +0 -0
- data/blogs/manual/manual_files/figure-latex/diverging_bar.pdf +0 -0
- data/blogs/{dev → manual}/model.rb +0 -0
- data/blogs/nse_dplyr/nse_dplyr.Rmd +849 -0
- data/blogs/nse_dplyr/nse_dplyr.html +878 -0
- data/blogs/nse_dplyr/nse_dplyr.md +1198 -0
- data/blogs/nse_dplyr/nse_dplyr.pdf +0 -0
- data/blogs/oh_my/oh_my.html +274 -386
- data/blogs/oh_my/oh_my.md +208 -205
- data/blogs/ruby_plot/ruby_plot.Rmd +64 -84
- data/blogs/ruby_plot/ruby_plot.html +235 -208
- data/blogs/ruby_plot/ruby_plot.md +239 -34
- data/blogs/ruby_plot/ruby_plot.pdf +0 -0
- data/blogs/ruby_plot/ruby_plot_files/figure-html/dose_len.png +0 -0
- data/blogs/ruby_plot/ruby_plot_files/figure-html/facet_by_delivery.png +0 -0
- data/blogs/ruby_plot/ruby_plot_files/figure-html/facet_by_dose.png +0 -0
- data/blogs/ruby_plot/ruby_plot_files/figure-html/facets_by_delivery_color.png +0 -0
- data/blogs/ruby_plot/ruby_plot_files/figure-html/facets_by_delivery_color2.png +0 -0
- data/blogs/ruby_plot/ruby_plot_files/figure-html/facets_with_decorations.png +0 -0
- data/blogs/ruby_plot/ruby_plot_files/figure-html/facets_with_jitter.png +0 -0
- data/blogs/ruby_plot/ruby_plot_files/figure-html/facets_with_points.png +0 -0
- data/blogs/ruby_plot/ruby_plot_files/figure-html/final_box_plot.png +0 -0
- data/blogs/ruby_plot/ruby_plot_files/figure-html/final_violin_plot.png +0 -0
- data/blogs/ruby_plot/ruby_plot_files/figure-html/violin_with_jitter.png +0 -0
- data/examples/Bibliography/master.bib +50 -0
- data/examples/Bibliography/stats.bib +72 -0
- data/examples/islr/ch2.spec.rb +1 -1
- data/examples/islr/ch3_boston.rb +4 -4
- data/examples/islr/x_y_rnorm.jpg +0 -0
- data/examples/latex_templates/Test-acm_article/Makefile +16 -0
- data/examples/latex_templates/Test-acm_article/Test-acm_article.Rmd +65 -0
- data/examples/latex_templates/Test-acm_article/acm_proc_article-sp.cls +1670 -0
- data/examples/latex_templates/Test-acm_article/sensys-abstract.cls +703 -0
- data/examples/latex_templates/Test-acm_article/sigproc.bib +59 -0
- data/examples/latex_templates/Test-acs_article/Test-acs_article.Rmd +260 -0
- data/examples/latex_templates/Test-acs_article/Test-acs_article.pdf +0 -0
- data/examples/latex_templates/Test-acs_article/acs-Test-acs_article.bib +11 -0
- data/examples/latex_templates/Test-acs_article/acs-my_output.bib +11 -0
- data/examples/latex_templates/Test-acs_article/acstest.bib +17 -0
- data/examples/latex_templates/Test-aea_article/AEA.cls +1414 -0
- data/examples/latex_templates/Test-aea_article/BibFile.bib +0 -0
- data/examples/latex_templates/Test-aea_article/Test-aea_article.Rmd +108 -0
- data/examples/latex_templates/Test-aea_article/Test-aea_article.pdf +0 -0
- data/examples/latex_templates/Test-aea_article/aea.bst +1269 -0
- data/examples/latex_templates/Test-aea_article/multicol.sty +853 -0
- data/examples/latex_templates/Test-aea_article/references.bib +0 -0
- data/examples/latex_templates/Test-aea_article/setspace.sty +546 -0
- data/examples/latex_templates/Test-amq_article/Test-amq_article.Rmd +256 -0
- data/examples/latex_templates/Test-amq_article/Test-amq_article.pdf +0 -0
- data/examples/latex_templates/Test-amq_article/Test-amq_article.pdfsync +3397 -0
- data/examples/latex_templates/Test-amq_article/pics/Figure2.pdf +0 -0
- data/examples/latex_templates/Test-ams_article/Test-ams_article.Rmd +215 -0
- data/examples/latex_templates/Test-ams_article/amstest.bib +436 -0
- data/examples/latex_templates/Test-asa_article/Test-asa_article.Rmd +153 -0
- data/examples/latex_templates/Test-asa_article/Test-asa_article.pdf +0 -0
- data/examples/latex_templates/Test-asa_article/agsm.bst +1353 -0
- data/examples/latex_templates/Test-asa_article/bibliography.bib +233 -0
- data/examples/latex_templates/Test-ieee_article/IEEEtran.bst +2409 -0
- data/examples/latex_templates/Test-ieee_article/IEEEtran.cls +6346 -0
- data/examples/latex_templates/Test-ieee_article/Test-ieee_article.Rmd +175 -0
- data/examples/latex_templates/Test-ieee_article/Test-ieee_article.pdf +0 -0
- data/examples/latex_templates/Test-ieee_article/mybibfile.bib +20 -0
- data/examples/latex_templates/Test-rjournal_article/RJournal.sty +335 -0
- data/examples/latex_templates/Test-rjournal_article/RJreferences.bib +18 -0
- data/examples/latex_templates/Test-rjournal_article/RJwrapper.pdf +0 -0
- data/examples/latex_templates/Test-rjournal_article/Test-rjournal_article.Rmd +52 -0
- data/examples/latex_templates/Test-springer_article/Test-springer_article.Rmd +65 -0
- data/examples/latex_templates/Test-springer_article/Test-springer_article.pdf +0 -0
- data/examples/latex_templates/Test-springer_article/bibliography.bib +26 -0
- data/examples/latex_templates/Test-springer_article/spbasic.bst +1658 -0
- data/examples/latex_templates/Test-springer_article/spmpsci.bst +1512 -0
- data/examples/latex_templates/Test-springer_article/spphys.bst +1443 -0
- data/examples/latex_templates/Test-springer_article/svglov3.clo +113 -0
- data/examples/latex_templates/Test-springer_article/svjour3.cls +1431 -0
- data/examples/misc/moneyball.rb +1 -1
- data/examples/misc/subsetting.rb +37 -37
- data/examples/rmarkdown/svm-rmarkdown-anon-ms-example/svm-rmarkdown-anon-ms-example.Rmd +73 -0
- data/examples/rmarkdown/svm-rmarkdown-anon-ms-example/svm-rmarkdown-anon-ms-example.pdf +0 -0
- data/examples/rmarkdown/svm-rmarkdown-article-example/svm-rmarkdown-article-example.Rmd +382 -0
- data/examples/rmarkdown/svm-rmarkdown-article-example/svm-rmarkdown-article-example.pdf +0 -0
- data/examples/rmarkdown/svm-rmarkdown-beamer-example/svm-rmarkdown-beamer-example.Rmd +164 -0
- data/examples/rmarkdown/svm-rmarkdown-beamer-example/svm-rmarkdown-beamer-example.pdf +0 -0
- data/examples/rmarkdown/svm-rmarkdown-cv/svm-rmarkdown-cv.Rmd +92 -0
- data/examples/rmarkdown/svm-rmarkdown-cv/svm-rmarkdown-cv.pdf +0 -0
- data/examples/rmarkdown/svm-rmarkdown-syllabus-example/attend-grade-relationships.csv +482 -0
- data/examples/rmarkdown/svm-rmarkdown-syllabus-example/svm-rmarkdown-syllabus-example.Rmd +280 -0
- data/examples/rmarkdown/svm-rmarkdown-syllabus-example/svm-rmarkdown-syllabus-example.pdf +0 -0
- data/examples/rmarkdown/svm-xaringan-example/svm-xaringan-example.Rmd +386 -0
- data/lib/R_interface/r.rb +2 -2
- data/lib/R_interface/r_libs.R +6 -1
- data/lib/R_interface/r_methods.rb +12 -2
- data/lib/R_interface/rdata_frame.rb +8 -17
- data/lib/R_interface/rindexed_object.rb +1 -2
- data/lib/R_interface/rlist.rb +1 -0
- data/lib/R_interface/robject.rb +20 -23
- data/lib/R_interface/rpkg.rb +15 -6
- data/lib/R_interface/rsupport.rb +13 -19
- data/lib/R_interface/ruby_extensions.rb +14 -18
- data/lib/R_interface/rvector.rb +0 -12
- data/lib/gknit.rb +2 -0
- data/lib/gknit/draft.rb +105 -0
- data/lib/gknit/knitr_engine.rb +6 -37
- data/lib/util/exec_ruby.rb +22 -84
- data/lib/util/inline_file.rb +7 -3
- data/specs/figures/bg.jpeg +0 -0
- data/specs/figures/bg.png +0 -0
- data/specs/figures/bg.svg +2 -2
- data/specs/figures/dose_len.png +0 -0
- data/specs/figures/no_args.jpeg +0 -0
- data/specs/figures/no_args.png +0 -0
- data/specs/figures/no_args.svg +2 -2
- data/specs/figures/width_height.jpeg +0 -0
- data/specs/figures/width_height.png +0 -0
- data/specs/figures/width_height_units1.jpeg +0 -0
- data/specs/figures/width_height_units1.png +0 -0
- data/specs/figures/width_height_units2.jpeg +0 -0
- data/specs/figures/width_height_units2.png +0 -0
- data/specs/r_dataframe.spec.rb +184 -11
- data/specs/r_list.spec.rb +4 -4
- data/specs/r_list_apply.spec.rb +11 -10
- data/specs/ruby_expression.spec.rb +3 -11
- data/specs/tmp.rb +106 -34
- data/version.rb +1 -1
- metadata +96 -33
- data/bin/gknit_old_r +0 -236
- data/blogs/dev/dev.Rmd +0 -77
- data/blogs/dev/dev.md +0 -87
- data/blogs/dev/dev_files/figure-html/bubble-1.png +0 -0
- data/blogs/dev/dev_files/figure-html/diverging_bar. +0 -0
- data/blogs/dev/dev_files/figure-html/diverging_bar.png +0 -0
- data/blogs/dplyr/dplyr.rb +0 -63
- data/blogs/galaaz_ggplot/galaaz_ggplot.aux +0 -43
- data/blogs/galaaz_ggplot/galaaz_ggplot.log +0 -640
- data/blogs/galaaz_ggplot/galaaz_ggplot.out +0 -10
- data/blogs/galaaz_ggplot/galaaz_ggplot.tex +0 -481
- data/blogs/galaaz_ggplot/midwest.png +0 -0
- data/blogs/galaaz_ggplot/scatter_plot.png +0 -0
- data/blogs/ruby_plot/ruby_plot.Rmd_external_figs +0 -662
- data/blogs/ruby_plot/ruby_plot.tex +0 -1077
- data/blogs/ruby_plot/ruby_plot_files/figure-html/dose_len.svg +0 -57
- data/blogs/ruby_plot/ruby_plot_files/figure-html/facet_by_delivery.svg +0 -106
- data/blogs/ruby_plot/ruby_plot_files/figure-html/facet_by_dose.svg +0 -110
- data/blogs/ruby_plot/ruby_plot_files/figure-html/facets_by_delivery_color.svg +0 -174
- data/blogs/ruby_plot/ruby_plot_files/figure-html/facets_by_delivery_color2.svg +0 -236
- data/blogs/ruby_plot/ruby_plot_files/figure-html/facets_with_jitter.svg +0 -296
- data/blogs/ruby_plot/ruby_plot_files/figure-html/facets_with_points.svg +0 -236
- data/blogs/ruby_plot/ruby_plot_files/figure-html/final_box_plot.svg +0 -218
- data/blogs/ruby_plot/ruby_plot_files/figure-html/final_violin_plot.svg +0 -128
- data/blogs/ruby_plot/ruby_plot_files/figure-html/violin_with_jitter.svg +0 -150
- data/examples/paper/paper.rb +0 -36
@@ -151,10 +151,6 @@ R.options(scipen: 999) # turn-off scientific notation like 1e+48
|
|
151
151
|
R.theme_set(R.theme_bw) # pre-set the bw theme.
|
152
152
|
|
153
153
|
midwest = ~:midwest
|
154
|
-
# midwest <- read.csv("http://goo.gl/G1K41K") # bkup data source
|
155
|
-
|
156
|
-
# R.awt # run the awt device if the plot should show on the screen
|
157
|
-
R.svg # run the svg device if an image should be generated
|
158
154
|
|
159
155
|
# Scatterplot
|
160
156
|
gg = midwest.ggplot(E.aes(x: :area, y: :poptotal)) +
|
@@ -168,19 +164,12 @@ gg = midwest.ggplot(E.aes(x: :area, y: :poptotal)) +
|
|
168
164
|
title: "Scatterplot",
|
169
165
|
caption: "Source: midwest")
|
170
166
|
|
171
|
-
R.png('midwest.png') # this line is not necessary with the awt device
|
172
167
|
puts gg
|
173
|
-
|
174
|
-
R.dev__off # R.dev__off turns off the device. If using awt, the plot
|
175
|
-
# window will be closed
|
176
168
|
```
|
177
169
|
|
178
|
-
```
|
179
|
-
## This is the fake output
|
180
|
-
```
|
181
170
|
|
171
|
+
![](galaaz_ggplot_files/figure-html/midwest_rb.png)<!-- -->
|
182
172
|
|
183
|
-
![Midwest Plot](midwest.png){width=70%}
|
184
173
|
|
185
174
|
In R, the code to generate this plot is the following
|
186
175
|
|
@@ -279,10 +268,6 @@ module CorpTheme
|
|
279
268
|
end
|
280
269
|
```
|
281
270
|
|
282
|
-
```
|
283
|
-
## This is the fake output
|
284
|
-
```
|
285
|
-
|
286
271
|
We now define a ScatterPlot class:
|
287
272
|
|
288
273
|
|
@@ -358,9 +343,7 @@ class ScatterPlot
|
|
358
343
|
# Plots the scatterplot
|
359
344
|
#---------------------------------------------------------------------------------
|
360
345
|
|
361
|
-
def plot
|
362
|
-
device == 'awt' ? R.awt : R.svg
|
363
|
-
|
346
|
+
def plot
|
364
347
|
gg = @data.ggplot(E.aes(x: @x, y: @y)) +
|
365
348
|
points +
|
366
349
|
R.geom_smooth(method: @method, se: @confidence) +
|
@@ -373,19 +356,13 @@ class ScatterPlot
|
|
373
356
|
caption: @caption) +
|
374
357
|
CorpTheme.global_theme
|
375
358
|
|
376
|
-
R.png('scatter_plot.png') if !(device == 'awt')
|
377
359
|
puts gg
|
378
|
-
R.dev__off
|
379
360
|
|
380
361
|
end
|
381
362
|
|
382
363
|
end
|
383
364
|
```
|
384
365
|
|
385
|
-
```
|
386
|
-
## This is the fake output
|
387
|
-
```
|
388
|
-
|
389
366
|
And this is the final code for making the scatter plot with the midwest data
|
390
367
|
|
391
368
|
|
@@ -402,15 +379,11 @@ sp.y_label = "Population"
|
|
402
379
|
sp.group_by(color: :state, size: :popdensity) # try sp.group_by(color: :state)
|
403
380
|
# available methods: "lm", "glm", "loess", "gam"
|
404
381
|
sp.add_smoothing_line(method: "glm")
|
405
|
-
|
406
|
-
puts sp
|
382
|
+
sp.plot
|
407
383
|
```
|
408
384
|
|
409
|
-
```
|
410
|
-
## This is the fake output
|
411
|
-
```
|
412
385
|
|
413
|
-
![
|
386
|
+
![](galaaz_ggplot_files/figure-html/scatter_plot_rb.png)<!-- -->
|
414
387
|
|
415
388
|
# Conclusion
|
416
389
|
|
Binary file
|
Binary file
|
@@ -4,7 +4,7 @@ screen, bellow, we generate an 'svg' image and then include it in this document.
|
|
4
4
|
generate an image, the R.svg device is used. To generate the plot on the screen, use the R.awt
|
5
5
|
device, as commented on the code.
|
6
6
|
|
7
|
-
```{ruby midwest_rb, warning=FALSE}
|
7
|
+
```{ruby midwest_rb, warning=FALSE, fig.width = 9.1, fig.height = 9.1}
|
8
8
|
require 'galaaz'
|
9
9
|
require 'ggplot'
|
10
10
|
|
@@ -13,10 +13,6 @@ R.options(scipen: 999) # turn-off scientific notation like 1e+48
|
|
13
13
|
R.theme_set(R.theme_bw) # pre-set the bw theme.
|
14
14
|
|
15
15
|
midwest = ~:midwest
|
16
|
-
# midwest <- read.csv("http://goo.gl/G1K41K") # bkup data source
|
17
|
-
|
18
|
-
# R.awt # run the awt device if the plot should show on the screen
|
19
|
-
R.svg # run the svg device if an image should be generated
|
20
16
|
|
21
17
|
# Scatterplot
|
22
18
|
gg = midwest.ggplot(E.aes(x: :area, y: :poptotal)) +
|
@@ -30,10 +26,6 @@ gg = midwest.ggplot(E.aes(x: :area, y: :poptotal)) +
|
|
30
26
|
title: "Scatterplot",
|
31
27
|
caption: "Source: midwest")
|
32
28
|
|
33
|
-
R.png('midwest.png') # this line is not necessary with the awt device
|
34
29
|
puts gg
|
35
|
-
|
36
|
-
R.dev__off # R.dev__off turns off the device. If using awt, the plot
|
37
|
-
# window will be closed
|
38
30
|
```
|
39
31
|
|
data/blogs/gknit/gknit.Rmd
CHANGED
@@ -4,7 +4,8 @@ author:
|
|
4
4
|
- "Rodrigo Botafogo"
|
5
5
|
- "Daniel Mossé - University of Pittsburgh"
|
6
6
|
tags: [Tech, Data Science, Ruby, R, GraalVM]
|
7
|
-
date: "
|
7
|
+
date: "29/04/2019"
|
8
|
+
bibliography: stats.bib
|
8
9
|
output:
|
9
10
|
pdf_document:
|
10
11
|
includes:
|
@@ -13,6 +14,7 @@ output:
|
|
13
14
|
html_document:
|
14
15
|
self_contained: true
|
15
16
|
keep_md: true
|
17
|
+
biblio-style: apsr
|
16
18
|
---
|
17
19
|
|
18
20
|
```{r setup, echo=FALSE}
|
@@ -21,7 +23,8 @@ output:
|
|
21
23
|
|
22
24
|
# Introduction
|
23
25
|
|
24
|
-
The idea of "literate programming" was first introduced by Donald Knuth in the
|
26
|
+
The idea of "literate programming" was first introduced by Donald Knuth in the
|
27
|
+
1980's [@Knuth:literate_programming].
|
25
28
|
The main intention of this approach was to develop software interspersing macro snippets,
|
26
29
|
traditional source code, and a natural language such as English in a document
|
27
30
|
that could be compiled into
|
@@ -37,19 +40,21 @@ single document or set of documents that when distributed to peers could be reru
|
|
37
40
|
the same output and reports.
|
38
41
|
|
39
42
|
The R community has put a great deal of effort in reproducible research. In 2002, Sweave was
|
40
|
-
introduced and it allowed mixing R code with Latex generating high quality PDF documents.
|
41
|
-
|
42
|
-
contained the whole narrative to reproduce the research.
|
43
|
-
2012, Knitr, developed by Yihui Xie from RStudio was released
|
44
|
-
|
43
|
+
introduced and it allowed mixing R code with Latex generating high quality PDF documents. A
|
44
|
+
Sweave document could include code, the results of executing the code, graphics and text
|
45
|
+
such that it contained the whole narrative to reproduce the research. In
|
46
|
+
2012, Knitr, developed by Yihui Xie from RStudio, was released to replace Sweave and to
|
47
|
+
consolidate in one single package the many extensions and add-on packages that
|
45
48
|
were necessary for Sweave.
|
46
49
|
|
47
|
-
With Knitr,
|
48
|
-
Markdown format. With
|
49
|
-
of formats such as HTML, markdown, Latex, PDF, dvi, etc.
|
50
|
-
multiple programming languages in the same document.
|
50
|
+
With Knitr, __R markdown__ was also developed, an extension to the
|
51
|
+
Markdown format. With __R markdown__ and Knitr it is possible to generate reports in a multitude
|
52
|
+
of formats such as HTML, markdown, Latex, PDF, dvi, etc. __R markdown__ also allows the use of
|
53
|
+
multiple programming languages such as R, Ruby, Python, etc. in the same document.
|
54
|
+
|
55
|
+
In __R markdown__, text is interspersed with
|
51
56
|
code chunks that can be executed and both the code and its results can become
|
52
|
-
part of the final report. Although
|
57
|
+
part of the final report. Although __R markdown__ allows multiple programming languages in the
|
53
58
|
same document, only R and Python (with
|
54
59
|
the reticulate package) can persist variables between chunks. For other languages, such as
|
55
60
|
Ruby, every chunk will start a new process and thus all data is lost between chunks, unless it
|
@@ -57,46 +62,76 @@ is somehow stored in a data file that is read by the next chunk.
|
|
57
62
|
|
58
63
|
Being able to persist data
|
59
64
|
between chunks is critical for literate programming otherwise the flow of the narrative is lost
|
60
|
-
by all the effort of having to save data and then reload it.
|
61
|
-
|
62
|
-
|
63
|
-
|
64
|
-
|
65
|
+
by all the effort of having to save data and then reload it. Although this might, at first, seem like
|
66
|
+
a small nuisance, not being able to persist data between chunks is a major issue. For example, let's
|
67
|
+
take a look at the following simple example in which we want to show how to create a list and then
|
68
|
+
use it. Let's first assume that data cannot be persisted between chunks. In the next chunk we
|
69
|
+
create a list, then we would need to save it to file, but to save it, we need somehow to marshal the
|
70
|
+
data into a binary format:
|
71
|
+
|
72
|
+
```{ruby no_persistence}
|
73
|
+
lst = R.list(a: 1, b: 2, c: 3)
|
74
|
+
lst.saveRDS("lst.rds")
|
75
|
+
```
|
76
|
+
then, on the next chunk, where variable 'lst' is used, we need to read back its value
|
77
|
+
|
78
|
+
```{ruby load_persisted_data}
|
79
|
+
lst = R.readRDS("lst.rds")
|
80
|
+
puts lst
|
81
|
+
```
|
82
|
+
|
83
|
+
Now, any single piece of code has dozens of variables that we might want to use and reuse between chunks.
|
84
|
+
Clearly, such an approach becomes quickly unmanageable. Probably, because of
|
85
|
+
this problem, it is very rare to see any __R markdown__ document in the Ruby community.
|
86
|
+
|
87
|
+
When variables can be used across chunks, then no overhead is needed:
|
88
|
+
|
89
|
+
```{ruby persistence}
|
90
|
+
lst = R.list(a: 1, b: 2, c: 3)
|
91
|
+
# any other code can be added here
|
92
|
+
```
|
93
|
+
|
94
|
+
```{ruby use_var}
|
95
|
+
puts lst
|
96
|
+
```
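The contrast described above can be sketched in plain Ruby, without Galaaz or GraalVM. This is only an illustration under stated assumptions: `save_state`, `load_state` and `ChunkSession` are hypothetical names, not part of gKnit, and the source does not say that gKnit is implemented this way. The first half mimics chunks that run in separate processes and must marshal data to a file; the second half mimics chunks that are evaluated in one shared scope.

```ruby
require "tmpdir"

# Hypothetical helpers: what every chunk would have to do if each chunk
# started a new process and could not see earlier variables.
def save_state(path, value)
  File.binwrite(path, Marshal.dump(value)) # marshal into a binary format
end

def load_state(path)
  Marshal.load(File.binread(path))
end

# Hypothetical stand-in for chunks sharing one interpreter: every chunk
# is evaluated against the same Binding, so local variables survive.
class ChunkSession
  def initialize
    @scope = binding
  end

  def run(code)
    @scope.eval(code)
  end
end

# Without persistence: write in one "chunk", read back in the next.
roundtrip = Dir.mktmpdir do |dir|
  path = File.join(dir, "lst.bin")
  save_state(path, { a: 1, b: 2, c: 3 })
  load_state(path)
end

# With persistence: the second "chunk" simply uses the variable.
session = ChunkSession.new
session.run("lst = { a: 1, b: 2, c: 3 }")
shared = session.run("lst")

puts roundtrip.inspect
puts shared.inspect
```

Both paths end with the same value, but the first needs a save and a load for every variable that crosses a chunk boundary.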
|
65
97
|
|
66
98
|
In the Python community, the same effort to have code and text in an integrated environment
|
67
99
|
started around the first decade of 2000. In 2006 iPython 0.7.2 was released. In 2014,
|
68
100
|
Fernando Pérez spun off project Jupyter from iPython, creating a web-based interactive
|
69
101
|
computation environment. Jupyter can now be used with many languages, including Ruby with the
|
70
|
-
iruby gem (https://github.com/SciRuby/iruby).
|
71
|
-
|
102
|
+
iruby gem (https://github.com/SciRuby/iruby). In order to have multiple languages in a Jupyter
|
103
|
+
notebook the SoS kernel was developed (https://vatlab.github.io/sos-docs/).
|
72
104
|
|
73
105
|
# gKnitting a Document
|
74
106
|
|
75
|
-
This document describes gKnit. gKnit
|
76
|
-
and output it in any of the available formats
|
107
|
+
This document describes gKnit. gKnit is based on knitr and __R markdown__ and can knit a document
|
108
|
+
written both in Ruby and/or R and output it in any of the available formats of __R markdown__. gKnit
|
109
|
+
allows Ruby developers to do literate programming and reproducible research by allowing them to
|
110
|
+
have in a single document, text and code.
|
111
|
+
|
77
112
|
gKnit runs atop GraalVM and Galaaz (an integration
|
78
|
-
library between Ruby and R). In gKnit, Ruby variables are persisted between
|
79
|
-
it an ideal solution for literate programming in this language. Also,
|
80
|
-
Galaaz, Ruby chunks can have access to R variables and Polyglot Programming
|
81
|
-
is quite natural.
|
113
|
+
library between Ruby and R - see below). In gKnit, Ruby variables are persisted between
|
114
|
+
chunks, making it an ideal solution for literate programming in this language. Also,
|
115
|
+
since it is based on Galaaz, Ruby chunks can have access to R variables and Polyglot Programming
|
116
|
+
with Ruby and R is quite natural.
|
82
117
|
|
83
|
-
Galaaz has been describe
|
118
|
+
Galaaz has already been described in the following posts:
|
84
119
|
|
85
120
|
* https://towardsdatascience.com/ruby-plotting-with-galaaz-an-example-of-tightly-coupling-ruby-and-r-in-graalvm-520b69e21021.
|
86
121
|
* https://medium.freecodecamp.org/how-to-make-beautiful-ruby-plots-with-galaaz-320848058857
|
87
122
|
|
88
|
-
This is not a blog post on
|
123
|
+
This is not a blog post on __R markdown__, and the interested user is directed to the following links
|
89
124
|
for detailed information on its capabilities and use.
|
90
125
|
|
91
126
|
* https://rmarkdown.rstudio.com/ or
|
92
127
|
* https://bookdown.org/yihui/rmarkdown/
|
93
128
|
|
94
|
-
|
95
|
-
Ruby and R documents quickly.
|
129
|
+
In this post, we will describe just the main aspects of __R markdown__, so the user can start
|
130
|
+
gKnitting Ruby and R documents quickly.
|
96
131
|
|
97
132
|
## The Yaml header
|
98
133
|
|
99
|
-
An
|
134
|
+
An __R markdown__ document should start with a Yaml header and be stored in a file with
|
100
135
|
'.Rmd' extension. This document has the following header for gKnitting an HTML document.
|
101
136
|
|
102
137
|
```
|
@@ -120,7 +155,7 @@ output:
|
|
120
155
|
|
121
156
|
For more information on the options in the Yaml header, check https://bookdown.org/yihui/rmarkdown/html-document.html.
|
122
157
|
|
123
|
-
##
|
158
|
+
## __R Markdown__ formatting
|
124
159
|
|
125
160
|
Document formatting can be done with simple markups such as:
|
126
161
|
|
@@ -156,7 +191,7 @@ Ordered Lists
|
|
156
191
|
+ Item 3b
|
157
192
|
```
|
158
193
|
|
159
|
-
|
194
|
+
For more R markdown formatting go to https://rmarkdown.rstudio.com/authoring_basics.html.
|
160
195
|
|
161
196
|
### R chunks
|
162
197
|
|
@@ -172,8 +207,7 @@ any optional chunk_label and options, as shown bellow:
|
|
172
207
|
````
|
173
208
|
|
174
209
|
for instance, let's add an R chunk to the document labeled 'first_r_chunk'. This is
|
175
|
-
a very simple code just to create a variable and print it out
|
176
|
-
be defined as follows:
|
210
|
+
a very simple code just to create a variable and print it out, as follows:
|
177
211
|
|
178
212
|
````
|
179
213
|
```{r first_r_chunk}`r ''`
|
@@ -182,7 +216,7 @@ print(vec)
|
|
182
216
|
```
|
183
217
|
````
|
184
218
|
|
185
|
-
If this block is added to an
|
219
|
+
If this block is added to an __R markdown__ document and gKnitted the result will be:
|
186
220
|
|
187
221
|
```{r first_r_chunk}
|
188
222
|
vec <- c(1, 2, 3)
|
@@ -208,9 +242,9 @@ vec3 <- vec * vec2
|
|
208
242
|
print(vec3)
|
209
243
|
```
|
210
244
|
|
211
|
-
A description of the available chunk options can be found in
|
245
|
+
A description of the available chunk options can be found in https://yihui.name/knitr/.
|
212
246
|
|
213
|
-
Let's add another R
|
247
|
+
Let's add another R chunk with a function definition. In this example, a vector
|
214
248
|
'r_vec' is created and
|
215
249
|
a new function 'reduce_sum' is defined. The chunk specification is
|
216
250
|
|
@@ -224,8 +258,8 @@ reduce_sum <- function(...) {
|
|
224
258
|
```
|
225
259
|
````
|
226
260
|
|
227
|
-
and this is how it will look like once executed. From now on,
|
228
|
-
show
|
261
|
+
and this is what it will look like once executed. From now on, to be concise in the
|
262
|
+
presentation we will not show chunk definitions any longer.
|
229
263
|
|
230
264
|
|
231
265
|
```{r data_creation}
|
@@ -249,6 +283,24 @@ this document. Note that there is no directive in the code to include the image
|
|
249
283
|
occurs automatically. The 'mpg' dataframe is natively available to R and to Galaaz as
|
250
284
|
well.
|
251
285
|
|
286
|
+
For the reader not knowledgeable of ggplot, ggplot is a graphics library based on "the
|
287
|
+
grammar of graphics" [@Wilkinson:grammar_of_graphics]. The idea of the grammar of graphics
|
288
|
+
is to build a graphic by adding layers to the plot. More information can be found in
|
289
|
+
https://towardsdatascience.com/a-comprehensive-guide-to-the-grammar-of-graphics-for-effective-visualization-of-multi-dimensional-1f92b4ed4149.
|
290
|
+
|
291
|
+
In the plot below the 'mpg' dataset from the ggplot2 package is used. "The data concerns city-cycle fuel
|
292
|
+
consumption in miles per gallon, to be predicted in terms of 3 multivalued discrete and 5
|
293
|
+
continuous attributes." (Quinlan, 1993)
|
294
|
+
|
295
|
+
First, the 'mpg' dataset is filtered to extract only cars from the following manufacturers: Audi, Ford,
|
296
|
+
Honda, and Hyundai and stored in the 'mpg_select' variable. Then, the selected dataframe is passed
|
297
|
+
to the ggplot function specifying in the aesthetic method (aes) that 'displacement' (displ) should
|
298
|
+
be plotted in the 'x' axis and 'city mileage' should be on the 'y' axis. In the 'labs' layer we
|
299
|
+
pass the 'title' and 'subtitle' for the plot. To the basic plot 'g', geom\_jitter is added, that
|
300
|
+
plots cars from the same manufacturer with the same color (col=manufacturer) and the size of the
|
301
|
+
car point equal to its highway consumption (size = hwy). Finally, a last layer is plotted containing
|
302
|
+
a linear regression line (method = "lm") for every manufacturer.
|
303
|
+
|
252
304
|
```{r bubble, dev='png'}
|
253
305
|
# load package and data
|
254
306
|
library(ggplot2)
|
@@ -262,9 +314,8 @@ g <- ggplot(mpg_select, aes(displ, cty)) +
|
|
262
314
|
labs(subtitle="mpg: Displacement vs City Mileage",
|
263
315
|
title="Bubble chart")
|
264
316
|
|
265
|
-
g
|
266
|
-
|
267
|
-
|
317
|
+
g + geom_jitter(aes(col=manufacturer, size=hwy)) +
|
318
|
+
geom_smooth(aes(col=manufacturer), method="lm", se=F)
|
268
319
|
```
|
269
320
|
|
270
321
|
### Ruby chunks
|
@@ -282,37 +333,37 @@ available to R chunks. Future versions will add those options.
|
|
282
333
|
|
283
334
|
In this example, the ruby chunk is called 'first_ruby_chunk'. One important
|
284
335
|
aspect of chunk labels is that they cannot be duplicated. If a chunk label is
|
285
|
-
duplicated,
|
336
|
+
duplicated, gKnit will stop with an error.
|
337
|
+
|
338
|
+
In the following chunk, variables 'a', 'b' and 'c' are standard Ruby variables
|
339
|
+
and 'vec' and 'vec2' are two vectors created by calling the 'c' method on the
|
340
|
+
R module.
|
286
341
|
|
287
|
-
|
288
|
-
|
289
|
-
available between chunks, they should be made as instance variables of the
|
290
|
-
RubyChunk class. In the following chunk, variable '\@a', '\@b' and '\@c'
|
291
|
-
are standard Ruby variables and '\@vec' and '\@vec2' are two vectors created
|
292
|
-
by calling the 'c' method on the R module.
|
342
|
+
In Galaaz, the R module allows us to access R functions transparently. The 'c'
|
343
|
+
function in R is a function that concatenates its arguments making a vector.
|
293
344
|
|
294
|
-
|
345
|
+
It
|
295
346
|
should be clear that there is no requirement in gknit to call or use any R
|
296
347
|
functions. gKnit will knit standard Ruby code, or even general text without
|
297
348
|
any code.
|
298
349
|
|
299
350
|
```{ruby split_data}
|
300
|
-
|
301
|
-
|
302
|
-
|
351
|
+
a = [1, 2, 3]
|
352
|
+
b = "US$ 250.000"
|
353
|
+
c = "The 'outputs' function"
|
303
354
|
|
304
|
-
|
305
|
-
|
355
|
+
vec = R.c(1, 2, 3)
|
356
|
+
vec2 = R.c(10, 20, 30)
|
306
357
|
```
|
307
358
|
|
308
|
-
In
|
359
|
+
In the next block, variables 'a', 'vec' and 'vec2' are used and printed.
|
309
360
|
|
310
361
|
```{ruby split2}
|
311
|
-
puts
|
312
|
-
puts
|
362
|
+
puts a
|
363
|
+
puts vec * vec2
|
313
364
|
```
|
314
365
|
|
315
|
-
Note that
|
366
|
+
Note that 'a' is a standard Ruby Array and 'vec' and 'vec2' are vectors that behave accordingly,
|
316
367
|
where multiplication works as expected.
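To make the difference concrete without requiring Galaaz, here is a plain-Ruby sketch. `elementwise_product` is a hypothetical helper that only imitates what multiplying two R vectors of equal length does; it is not Galaaz code, and unlike R it does not recycle a shorter vector.

```ruby
# A standard Ruby Array: '*' with an Integer repeats the array.
a = [1, 2, 3]
repeated = a * 2

# Hypothetical helper imitating R's element-wise vector product.
# Assumption: both inputs have the same length (R would recycle the
# shorter one; this sketch raises instead).
def elementwise_product(xs, ys)
  raise ArgumentError, "length mismatch" unless xs.size == ys.size

  xs.zip(ys).map { |x, y| x * y }
end

product = elementwise_product([1, 2, 3], [10, 20, 30])

puts repeated.inspect
puts product.inspect
```

With Galaaz the same element-wise result comes from writing `vec * vec2` directly, as in the chunk above.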
|
317
368
|
|
318
369
|
|
@@ -338,18 +389,57 @@ puts R.reduce_sum(~:r_vec)
|
|
338
389
|
### Ruby Plotting
|
339
390
|
|
340
391
|
We have seen an example of plotting with R. Plotting with Ruby does not require
|
341
|
-
anything different from plotting with R
|
392
|
+
anything different from plotting with R. In the following example, we plot a
|
393
|
+
diverging bar graph using the 'mtcars' dataframe from R. This data was extracted
|
394
|
+
from the 1974 Motor Trend US magazine, and comprises fuel consumption and 10 aspects
|
395
|
+
of automobile design and performance for 32 automobiles (1973–74 models). The
|
396
|
+
eleven variables, fuel consumption plus the ten aspects, are:
|
397
|
+
|
398
|
+
* mpg: Miles/(US) gallon
|
399
|
+
* cyl: Number of cylinders
|
400
|
+
* disp: Displacement (cu.in.)
|
401
|
+
* hp: Gross horsepower
|
402
|
+
* drat: Rear axle ratio
|
403
|
+
* wt: Weight (1000 lbs)
|
404
|
+
* qsec: 1/4 mile time
|
405
|
+
* vs: Engine (0 = V-shaped, 1 = straight)
|
406
|
+
* am: Transmission (0 = automatic, 1 = manual)
|
407
|
+
* gear: Number of forward gears
|
408
|
+
* carb: Number of carburetors
|
409
|
+
|
410
|
+
|
411
|
+
```{ruby diverging_plot_pre}
|
412
|
+
# copy the R variable :mtcars to the Ruby mtcars variable
|
413
|
+
mtcars = ~:mtcars
|
342
414
|
|
343
|
-
|
344
|
-
|
415
|
+
# create a new column 'car_name' to store the car names so that it can be
|
416
|
+
# used for plotting. The 'rownames' of the data frame cannot be used as
|
417
|
+
# data for plotting
|
418
|
+
mtcars.car_name = R.rownames(:mtcars)
|
345
419
|
|
346
|
-
|
420
|
+
# compute normalized mpg and add it to a new column called mpg_z
|
421
|
+
# Note that the mean value for mpg can be obtained by calling the 'mean'
|
422
|
+
# function on the vector 'mtcars.mpg'. The same with the standard
|
423
|
+
# deviation 'sd'. The vector is then rounded to two digits with 'round 2'
|
424
|
+
mtcars.mpg_z = ((mtcars.mpg - mtcars.mpg.mean)/mtcars.mpg.sd).round 2
|
425
|
+
|
426
|
+
# create a new column 'mpg_type'. Function 'ifelse' is a vectorized function
|
427
|
+
# that looks at every element of the mpg_z vector and if the value is below
|
428
|
+
# 0, returns 'below', otherwise returns 'above'
|
429
|
+
mtcars.mpg_type = (mtcars.mpg_z < 0).ifelse("below", "above")
|
347
430
|
|
348
|
-
|
349
|
-
mtcars.mpg_z = ((mtcars.mpg - mtcars.mpg.mean) / mtcars.mpg.sd).round 2
|
350
|
-
mtcars.mpg_type = (mtcars.mpg_z < 0).ifelse('below', 'above')
|
431
|
+
# order the mtcars data set by the mpg_z vector from smaller to larger values
|
351
432
|
mtcars = mtcars[mtcars.mpg_z.order, :all]
|
352
|
-
|
433
|
+
|
434
|
+
# convert the car_name column to a factor to retain sorted order in plot
|
435
|
+
mtcars.car_name = mtcars.car_name.factor levels: mtcars.car_name
|
436
|
+
|
437
|
+
# let's look at the first records of the final data frame
|
438
|
+
puts mtcars.head
|
439
|
+
```
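
The normalization performed in the chunk above can also be sketched in plain Ruby, without Galaaz or R. This is only an illustration of the arithmetic: the method names 'z_scores' and 'mpg_types' and the sample values are assumptions made for this example and are not part of Galaaz. The sketch assumes that R's 'sd' is the sample standard deviation, which divides by n - 1.

```ruby
# Plain-Ruby sketch of the normalization step; no Galaaz or R required.
# 'z_scores' and 'mpg_types' are hypothetical helper names.
def z_scores(values)
  mean = values.sum / values.size.to_f
  # sample standard deviation (divides by n - 1), as R's sd() does
  sd = Math.sqrt(values.sum { |v| (v - mean)**2 } / (values.size - 1))
  values.map { |v| ((v - mean) / sd).round(2) }
end

# element-wise equivalent of (mpg_z < 0).ifelse("below", "above")
def mpg_types(z)
  z.map { |v| v < 0 ? "below" : "above" }
end

mpg = [21.0, 22.8, 18.7, 14.3, 24.4] # illustrative values only
mpg_z = z_scores(mpg)
mpg_type = mpg_types(mpg_z)

# order the rows by mpg_z, from smaller to larger values
rows = mpg_z.zip(mpg_type).sort_by { |z, _| z }
rows.each { |z, type| puts "#{z}\t#{type}" }
```

In the Galaaz version all of these steps operate on whole R vectors, so no explicit 'map' is needed.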
|
440
|
+
|
441
|
+
```{ruby diverging_bar, fig.width = 9.1, fig.height = 6.5}
|
442
|
+
require 'ggplot'
|
353
443
|
|
354
444
|
puts mtcars.ggplot(E.aes(x: :car_name, y: :mpg_z, label: :mpg_z)) +
|
355
445
|
R.geom_bar(E.aes(fill: :mpg_type), stat: 'identity', width: 0.5) +
|
@@ -363,69 +453,70 @@ puts mtcars.ggplot(E.aes(x: :car_name, y: :mpg_z, label: :mpg_z)) +
|
|
363
453
|
|
364
454
|
### Inline Ruby code
|
365
455
|
|
366
|
-
When using a Ruby chunk, the code and the output are
|
367
|
-
This formatting is not always desired. Sometimes,
|
368
|
-
Ruby
|
369
|
-
with the 'rb' engine. The following
|
456
|
+
When using a Ruby chunk, the code and the output are formatted in blocks as seen above.
|
457
|
+
This formatting is not always desired. Sometimes, we want to have the results of the
|
458
|
+
Ruby evaluation included in the middle of a phrase. gKnit allows adding inline Ruby code
|
459
|
+
with the 'rb' engine. The following chunk specification will
|
370
460
|
create inline Ruby text:
|
371
461
|
|
372
462
|
````
|
373
|
-
This is some text with inline Ruby accessing variable
|
374
|
-
```{rb puts "```{rb puts
|
463
|
+
This is some text with inline Ruby accessing variable 'b' which has value:
|
464
|
+
```{rb puts "```{rb puts b}\n```"}
|
375
465
|
```
|
376
466
|
and is followed by some other text!
|
377
467
|
````
|
378
468
|
|
379
|
-
Note that it is important not to add any new line before of after the code
|
380
|
-
block if we want everything to be in only one line, resulting in the following sentence
|
381
|
-
with inline Ruby code
|
382
|
-
|
383
469
|
<div style="margin-bottom:30px;">
|
384
470
|
</div>
|
385
471
|
|
386
|
-
This is some text with inline Ruby accessing variable
|
387
|
-
```{rb puts
|
472
|
+
This is some text with inline Ruby accessing variable 'b' which has value:
|
473
|
+
```{rb puts b}
|
388
474
|
```
|
389
475
|
and is followed by some other text!
|
390
476
|
|
391
477
|
<div style="margin-bottom:30px;">
|
392
478
|
</div>
|
393
479
|
|
480
|
+
Note that it is important not to add any new line before or after the code
|
481
|
+
block if we want everything to be in only one line, resulting in the following sentence
|
482
|
+
with inline Ruby code.
|
483
|
+
|
394
484
|
|
395
485
|
```{ruby heading, echo = FALSE}
|
396
|
-
outputs "### #{
|
486
|
+
outputs "### #{c}"
|
397
487
|
```
|
398
488
|
|
399
|
-
He have previously used the standard 'puts' method in Ruby chunks in order
|
400
|
-
output.
|
489
|
+
We have previously used the standard 'puts' method in Ruby chunks in order to produce
|
490
|
+
output. The result of a 'puts', as seen in all previous chunks that use it, is formatted
|
491
|
+
inside a white box that
|
401
492
|
follows the code block. Many times however, we would like to do some processing in the
|
402
493
|
Ruby chunk and have the result of this processing generate an output that is
|
403
|
-
|
494
|
+
"included" in the document as if we had typed it in the __R markdown__ document.
|
404
495
|
|
405
|
-
For example, suppose we want to create a new
|
496
|
+
For example, suppose we want to create a new heading in our document, but the heading
|
406
497
|
phrase is the result of some code processing: maybe it's the first line of a file we are
|
407
|
-
going to read. Method 'outputs' adds its output as if typed in the
|
498
|
+
going to read. Method 'outputs' adds its output as if typed in the __R markdown__ document.
|
408
499
|
|
409
|
-
Take now a look at variable '
|
410
|
-
'
|
500
|
+
Now take a look at variable 'c' (it was defined in a previous block above) as
|
501
|
+
'c = "The 'outputs' function"'. "The 'outputs' function" is actually the name of this
|
411
502
|
section and it was created using the 'outputs' function inside a Ruby chunk.
|
412
503
|
|
413
504
|
The Ruby chunk to generate this heading is:
|
414
505
|
|
415
506
|
````
|
416
507
|
```{ruby heading}`r ''`
|
417
|
-
outputs "### #{
|
508
|
+
outputs "### #{c}"
|
418
509
|
```
|
419
510
|
````
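
As a rough plain-Ruby sketch of the idea described above, the heading text can be built from the first line of a file before it is handed to 'outputs'. The helper name 'heading_from_file' and the temporary file are assumptions made for this example; outside of a gKnit chunk 'outputs' is not available, so the sketch uses 'puts' only to show the resulting string.

```ruby
require 'tempfile'

# Hypothetical helper: build a markdown heading from the first line of a file.
def heading_from_file(path, level: 3)
  first_line = File.open(path, &:readline).strip
  "#{'#' * level} #{first_line}"
end

# Illustrative input file; in a real document this would be an existing file.
Tempfile.create('notes') do |f|
  f.write("The 'outputs' function\nsecond line\n")
  f.flush
  # inside a gKnit Ruby chunk this string would be passed to 'outputs'
  puts heading_from_file(f.path)
end
```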
|
420
511
|
|
421
|
-
The three '###'
|
512
|
+
The '###' prefix is the way we add a Heading 3 in __R markdown__.
|
422
513
|
|
423
514
|
|
424
515
|
### HTML Output from Ruby Chunks
|
425
516
|
|
426
|
-
We've just seen the use of method 'outputs' to add text to the the
|
427
|
-
document. This technique can also be used to add HTML code to the document. In
|
428
|
-
|
517
|
+
We've just seen the use of method 'outputs' to add text to the __R markdown__
|
518
|
+
document. This technique can also be used to add HTML code to the document. In
|
519
|
+
__R markdown__, any HTML code typed directly in the document will be properly rendered.
|
429
520
|
Here, for instance, is a table definition in HTML and its output in the document:
|
430
521
|
|
431
522
|
```
|
@@ -471,51 +562,52 @@ Here, for instance, is a table definition in HTML and its output in the document
|
|
471
562
|
<div style="margin-bottom:30px;">
|
472
563
|
</div>
|
473
564
|
|
474
|
-
But manually creating HTML output is not always easy or desirable
|
475
|
-
|
565
|
+
But manually creating HTML output is not always easy or desirable, especially
|
566
|
+
if we intend the document to be rendered in other formats, for example, as LaTeX.
|
567
|
+
Also, the above
|
568
|
+
table looks ugly. The 'kableExtra' library is a great library for
|
476
569
|
creating beautiful tables. Take a look at https://cran.r-project.org/web/packages/kableExtra/vignettes/awesome_table_in_html.html
|
477
570
|
|
478
571
|
In the next chunk, we output the 'mtcars' dataframe from R in a nicely formatted
|
479
572
|
table. Note that we retrieve the mtcars dataframe by using '~:mtcars'.
|
480
573
|
|
481
574
|
```{ruby nice_table}
|
482
|
-
R.
|
575
|
+
R.install_and_loads('kableExtra')
|
483
576
|
outputs (~:mtcars).kable.kable_styling
|
484
577
|
```
|
485
578
|
|
486
|
-
### Including Ruby files
|
579
|
+
### Including Ruby files in a chunk
|
487
580
|
|
488
581
|
R is a language that was created to be easy and fast for statisticians to use. As far
|
489
|
-
as I know
|
582
|
+
as I know, it was not a
|
490
583
|
language to be used for developing large systems. Of course, there are large systems and
|
491
584
|
libraries in R, but the focus of the language is for developing statistical models and
|
492
585
|
distributing them to peers.
|
493
586
|
|
494
587
|
Ruby, on the other hand, is a language for large software development. Systems written in
|
495
|
-
Ruby will have dozens, hundreds or even thousands of files.
|
496
|
-
large system with
|
497
|
-
|
498
|
-
|
499
|
-
typed in the '.Rmd' file.
|
588
|
+
Ruby will have dozens, hundreds or even thousands of files. To document a
|
589
|
+
large system with literate programming, we cannot expect the developer to add all the
|
590
|
+
files in a single '.Rmd' file. gKnit provides the 'include' chunk engine to include
|
591
|
+
a Ruby file as if it had been typed in the '.Rmd' file.
|
500
592
|
|
501
593
|
To include a file, the following chunk should be created, where <filename> is the name of
|
502
|
-
the file to be
|
594
|
+
the file to be included and where the extension, if it is '.rb', does not need to be added.
|
503
595
|
If the 'relative' option is not included, then it is treated as TRUE. When 'relative' is
|
504
|
-
true, '
|
505
|
-
is searched to find the file and it is 'require'd.
|
596
|
+
true, Ruby's 'require\_relative' semantics is used to load the file; when false, Ruby's
|
597
|
+
\$LOAD_PATH is searched to find the file and it is 'require'd.
|
506
598
|
|
507
599
|
````
|
508
600
|
```{include <filename>, relative = <TRUE/FALSE>}`r ''`
|
509
601
|
```
|
510
602
|
````
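
The two loading modes described above can be sketched in plain Ruby. The helper 'include_ruby_file' below is hypothetical and is not gKnit's implementation; it only mimics the described semantics of the 'relative' option. The 'base_dir' parameter and the throwaway file 'hello_model.rb' are likewise assumptions made for this example.

```ruby
require 'tmpdir'

# Hypothetical helper, not gKnit's implementation: it only mimics the
# semantics described for the 'relative' option.
def include_ruby_file(name, relative: true, base_dir: Dir.pwd)
  if relative
    # resolve the name against a known directory, much as require_relative
    # resolves it against the directory of the calling file
    require File.expand_path(name, base_dir)
  else
    # a plain require searches every directory listed in $LOAD_PATH
    require name
  end
end

# relative = FALSE: 'find' ships with Ruby's standard library
include_ruby_file('find', relative: false)

# relative = TRUE: load a throwaway file from a temporary directory
Dir.mktmpdir do |dir|
  File.write(File.join(dir, 'hello_model.rb'), "HELLO_MODEL_LOADED = true\n")
  include_ruby_file('hello_model', base_dir: dir)
end
```

As in the chunk specification, the '.rb' extension is omitted in both calls, since 'require' adds it when resolving the file.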
|
511
603
|
|
512
|
-
|
604
|
+
Below we include file 'model.rb', which is in the same directory as this blog.
|
513
605
|
This code uses R's 'caret' package to split a dataset into train and test sets.
|
514
606
|
The 'caret' package is a very important and useful package for doing Data Analysis;
|
515
607
|
it has hundreds of functions for all steps of the Data Analysis workflow. To
|
516
|
-
just split a dataset
|
517
|
-
it here only to show that integrating Ruby and R and
|
518
|
-
package as 'caret' is trivial with Galaaz.
|
608
|
+
use 'caret' just to split a dataset is like using the proverbial cannon to
|
609
|
+
kill the fly. We use it here only to show that integrating Ruby and R and
|
610
|
+
using even a package as complex as 'caret' is trivial with Galaaz.
|
519
611
|
|
520
612
|
A word of advice: the 'caret' package has lots of dependencies and installing
|
521
613
|
it in a Linux system is a time consuming operation. Method 'R.install_and_loads'
|
@@ -540,10 +632,11 @@ puts model.test.head
|
|
540
632
|
### Documenting Gems
|
541
633
|
|
542
634
|
gKnit also allows developers to document and load files that are not in the same directory
|
543
|
-
of the '.Rmd' file.
|
544
|
-
file in Ruby's \$LOAD_PATH and load it if found.
|
635
|
+
as the '.Rmd' file.
|
545
636
|
|
546
|
-
Here is an example of loading the 'find.rb' file from TruffleRuby.
|
637
|
+
Here is an example of loading the 'find.rb' file from TruffleRuby. In this example, relative
|
638
|
+
is set to FALSE, so Ruby will look for the file in its $LOAD\_PATH, and the user does not
|
639
|
+
need to know its directory.
|
547
640
|
|
548
641
|
````
|
549
642
|
```{include find, relative = FALSE}`r ''`
|
@@ -556,7 +649,7 @@ Here is an example of loading the 'find.rb' file from TruffleRuby.
|
|
556
649
|
## Converting to PDF
|
557
650
|
|
558
651
|
One of the beauties of knitr is that the same input can be converted to many different outputs.
|
559
|
-
One very useful format, is, of course, PDF. In order to converted an
|
652
|
+
One very useful format is, of course, PDF. In order to convert an __R markdown__ file to PDF
|
560
653
|
it is necessary to have LaTeX installed on the system. We will not explain here how to
|
561
654
|
install LaTeX as there are plenty of documents on the web showing how to proceed.
|
562
655
|
|
@@ -570,27 +663,39 @@ author: "Rodrigo Botafogo"
|
|
570
663
|
tags: [Galaaz, Ruby, R, TruffleRuby, FastR, GraalVM, knitr, gknit]
|
571
664
|
date: "29 October 2018"
|
572
665
|
output:
|
573
|
-
|
666
|
+
pdf\_document:
|
574
667
|
includes:
|
575
|
-
|
576
|
-
|
668
|
+
in\_header: ["../../sty/galaaz.sty"]
|
669
|
+
number\_sections: yes
|
577
670
|
---
|
578
671
|
```
|
579
672
|
|
673
|
+
|
580
674
|
# Conclusion
|
581
675
|
|
582
|
-
|
583
|
-
|
584
|
-
|
585
|
-
|
586
|
-
|
676
|
+
In order to do reproducible research, one of the main basic tools needed is a system that
|
677
|
+
allows "literate programming" where text, code and possibly a set of files can be compiled
|
678
|
+
into a report that can be easily distributed to peers. Peers should be able to use this
|
679
|
+
same set of files to rerun the compilation on their own, obtaining the exact same original
|
680
|
+
report. gKnit is such a system for Ruby and R. It uses __R Markdown__ to integrate
|
681
|
+
text and code chunks, where code chunks can either be part of the __R Markdown__ file or
|
682
|
+
be imported from files in the system. Ideally, in reproducible research, all the files
|
683
|
+
needed to rebuild a report should be easily packed together (in the same zipped directory)
|
684
|
+
and distributed to peers for reexecution.
|
685
|
+
|
686
|
+
One of the promises of Oracle's GraalVM is that users/developers will be able to use the best tool
|
687
|
+
for their task at hand, independently of the programming language the tool was written in.
|
688
|
+
We developed and implemented Galaaz atop the GraalVM and Truffle interop messages and
|
689
|
+
the time and effort to wrap Ruby over R - Galaaz - or to
|
690
|
+
wrap Knitr with gKnit was a fraction of a fraction of a fraction (a one-man effort for a couple
|
691
|
+
of hours a day, for approximately six months) of the time required to
|
587
692
|
implement the original tools. Trying to reimplement all R packages in Ruby would require the
|
588
|
-
same effort it is taking Python to implement NumPy,
|
693
|
+
same effort it is taking Python to implement NumPy, Pandas and all supporting libraries and it
|
589
694
|
is unlikely that this effort would ever be done. GraalVM has allowed Ruby to profit "almost
|
590
695
|
for free" from this huge set of libraries and tools that make R one of the most used
|
591
696
|
languages for data analysis and machine learning.
|
592
697
|
|
593
|
-
More interesting
|
698
|
+
More interesting than wrapping the R libraries with Ruby is that Ruby adds
|
594
699
|
value to R, by allowing developers to use powerful and modern constructs for code reuse that
|
595
700
|
are not the strong points of R. As shown in this blog, R and Ruby can easily communicate
|
596
701
|
and R can be structured in classes and modules in a way that greatly expands its power and
|
@@ -621,3 +726,7 @@ the gnu compiler and tools should be enough. I am not sure what is needed on th
|
|
621
726
|
## Usage
|
622
727
|
|
623
728
|
* gknit \<filename\>
|
729
|
+
|
730
|
+
|
731
|
+
# References
|
732
|
+
|