galaaz 0.4.1 → 0.4.2
- checksums.yaml +4 -4
- data/Rakefile +29 -0
- data/bin/gknit +208 -10
- data/bin/gknit2 +14 -0
- data/bin/gknit2~ +6 -0
- data/bin/prepareR.rb +3 -0
- data/bin/prepareR.rb~ +1 -0
- data/bin/tmp.py +51 -0
- data/blogs/dev/dev.Rmd +70 -0
- data/blogs/dev/dev.Rmd~ +104 -0
- data/blogs/dev/dev.html +209 -0
- data/blogs/dev/dev.md +72 -0
- data/blogs/dev/dev_files/figure-html/bubble-1.png +0 -0
- data/blogs/dev/model.rb +41 -0
- data/blogs/galaaz_ggplot/galaaz_ggplot.Rmd +55 -27
- data/blogs/galaaz_ggplot/galaaz_ggplot.aux +44 -0
- data/blogs/galaaz_ggplot/galaaz_ggplot.dvi +0 -0
- data/blogs/galaaz_ggplot/galaaz_ggplot.html +17 -4
- data/blogs/galaaz_ggplot/galaaz_ggplot.out +10 -0
- data/blogs/galaaz_ggplot/galaaz_ggplot.pdf +0 -0
- data/blogs/galaaz_ggplot/galaaz_ggplot.tex +630 -0
- data/blogs/galaaz_ggplot/midwest.Rmd +1 -1
- data/blogs/galaaz_ggplot/midwest_external_png +13 -0
- data/blogs/galaaz_ggplot/midwest_external_png~ +1 -0
- data/blogs/gknit/gknit.Rmd +500 -0
- data/blogs/gknit/gknit.Rmd~ +184 -0
- data/blogs/gknit/gknit.Rnd~ +17 -0
- data/blogs/gknit/gknit.html +528 -0
- data/blogs/gknit/gknit.md +628 -0
- data/blogs/gknit/gknit.pdf +0 -0
- data/blogs/gknit/gknit.tex +745 -0
- data/blogs/gknit/gknit_files/figure-html/bubble-1.png +0 -0
- data/blogs/gknit/gknit_files/figure-html/diverging_bar.png +0 -0
- data/blogs/gknit/model.rb +41 -0
- data/blogs/gknit/model.rb~ +46 -0
- data/blogs/ruby_plot/figures/dose_len.png +0 -0
- data/blogs/ruby_plot/figures/facet_by_delivery.png +0 -0
- data/blogs/ruby_plot/figures/facet_by_dose.png +0 -0
- data/blogs/ruby_plot/figures/facets_by_delivery_color.png +0 -0
- data/blogs/ruby_plot/figures/facets_by_delivery_color2.png +0 -0
- data/blogs/ruby_plot/figures/facets_with_decorations.png +0 -0
- data/blogs/ruby_plot/figures/facets_with_jitter.png +0 -0
- data/blogs/ruby_plot/figures/facets_with_points.png +0 -0
- data/blogs/ruby_plot/figures/final_box_plot.png +0 -0
- data/blogs/ruby_plot/figures/final_violin_plot.png +0 -0
- data/blogs/ruby_plot/figures/violin_with_jitter.png +0 -0
- data/blogs/ruby_plot/ruby_plot.Rmd +680 -0
- data/blogs/ruby_plot/ruby_plot.Rmd~ +215 -0
- data/blogs/ruby_plot/ruby_plot.html +563 -0
- data/blogs/ruby_plot/ruby_plot.md +731 -0
- data/blogs/ruby_plot/ruby_plot.pdf +0 -0
- data/blogs/ruby_plot/ruby_plot.tex +458 -0
- data/examples/sthda_ggplot/all.rb +0 -6
- data/examples/sthda_ggplot/two_variables_cont_bivariate/geom_hex.rb +1 -1
- data/examples/sthda_ggplot/two_variables_cont_cont/misc.rb +1 -1
- data/examples/sthda_ggplot/two_variables_disc_cont/geom_bar.rb +2 -2
- data/examples/sthda_ggplot/two_variables_disc_disc/geom_jitter.rb +0 -1
- data/lib/R/eng_ruby.R +62 -0
- data/lib/R/eng_ruby.R~ +63 -0
- data/lib/R_interface/capture_plot.rb~ +23 -0
- data/lib/{R → R_interface}/expression.rb +0 -0
- data/lib/{R → R_interface}/r.rb +10 -1
- data/lib/{R → R_interface}/r.rb~ +0 -0
- data/lib/{R → R_interface}/r_methods.rb +21 -5
- data/lib/{R → R_interface}/rbinary_operators.rb +6 -1
- data/lib/R_interface/rclosure.rb +38 -0
- data/lib/{R → R_interface}/rdata_frame.rb +0 -0
- data/lib/R_interface/rdevices.R +31 -0
- data/lib/R_interface/rdevices.rb +225 -0
- data/lib/{R/rclosure.rb → R_interface/rdevices.rb~} +3 -10
- data/lib/{R → R_interface}/renvironment.rb +0 -0
- data/lib/{R → R_interface}/rexpression.rb +0 -0
- data/lib/{R → R_interface}/rindexed_object.rb +0 -0
- data/lib/{R → R_interface}/rlanguage.rb +0 -0
- data/lib/{R → R_interface}/rlist.rb +0 -0
- data/lib/{R → R_interface}/rmatrix.rb +0 -0
- data/lib/{R → R_interface}/rmd_indexed_object.rb +0 -0
- data/lib/{R → R_interface}/robject.rb +5 -0
- data/lib/{R → R_interface}/rpkg.rb +0 -0
- data/lib/{R → R_interface}/rsupport.rb +49 -13
- data/lib/{R → R_interface}/rsupport_scope.rb +0 -0
- data/lib/{R → R_interface}/rsymbol.rb +1 -0
- data/lib/{R → R_interface}/ruby_callback.rb +0 -0
- data/lib/{R → R_interface}/ruby_extensions.rb +2 -1
- data/lib/{R → R_interface}/runary_operators.rb +0 -0
- data/lib/{R → R_interface}/rvector.rb +0 -0
- data/lib/galaaz.rb +4 -2
- data/lib/gknit.rb +27 -0
- data/lib/gknit.rb~ +26 -0
- data/lib/gknit/knitr_engine.rb +120 -0
- data/lib/gknit/knitr_engine.rb~ +102 -0
- data/lib/gknit/ruby_engine.rb +70 -0
- data/lib/gknit/ruby_engine.rb~ +72 -0
- data/lib/util/exec_ruby.rb +8 -7
- data/lib/util/inline_file.rb +70 -0
- data/lib/util/inline_file.rb~ +23 -0
- data/r_requires/ggplot.rb +1 -8
- data/r_requires/knitr.rb +27 -0
- data/r_requires/knitr.rb~ +4 -0
- data/specs/r_language.spec.rb +22 -0
- data/specs/r_plots.spec.rb +72 -0
- data/specs/r_plots.spec.rb~ +37 -0
- data/specs/tmp.rb +255 -1
- data/version.rb +1 -1
- metadata +89 -39
@@ -4,7 +4,7 @@ screen, bellow, we generate an 'svg' image and then include it in this document.
|
|
4
4
|
generate an image, the R.svg device is used. To generate the plot on the screen, use the R.awt
|
5
5
|
device, as commented in the code.
|
6
6
|
|
7
|
-
```{
|
7
|
+
```{ruby midwest_rb, warning=FALSE}
|
8
8
|
require 'galaaz'
|
9
9
|
require 'ggplot'
|
10
10
|
|
@@ -0,0 +1,13 @@
|
|
1
|
+
![Midwest Plot](https://user-images.githubusercontent.com/3999729/46742999-87bc2480-cc7e-11e8-9f16-31c3437e4a58.PNG){width=70%}
|
2
|
+
|
3
|
+
![Midwest Plot with 'glm' function and modified theme](https://user-images.githubusercontent.com/3999729/47120345-a903ae80-d244-11e8-9be3-a0db13cf51ab.PNG)
|
4
|
+
|
5
|
+
|
6
|
+
Sir Galahad (/ˈɡæləhæd/; sometimes referred to as Galeas /ɡəˈliːəs/ or Galath /ˈɡæləθ/),
|
7
|
+
in Arthurian legend, is a knight of King Arthur's Round Table and one of the three
|
8
|
+
achievers of the Holy Grail. He is the illegitimate son of Sir Lancelot and Elaine of
|
9
|
+
Corbenic, and is renowned for his gallantry and purity as the most perfect of all knights.
|
10
|
+
Emerging quite late in the medieval Arthurian tradition, Sir Galahad first appears in the
|
11
|
+
Lancelot–Grail cycle, and his story is taken up in later works such as the Post-Vulgate
|
12
|
+
Cycle and Sir Thomas Malory's Le Morte d'Arthur. His name should not be mistaken with
|
13
|
+
Galehaut, a different knight from Arthurian legend.
|
@@ -0,0 +1 @@
|
|
1
|
+
![Midwest Plot with 'glm' function and modified theme](https://user-images.githubusercontent.com/3999729/47120345-a903ae80-d244-11e8-9be3-a0db13cf51ab.PNG)
|
@@ -0,0 +1,500 @@
|
|
1
|
+
---
|
2
|
+
title: "gKnit - Ruby and R Knitting with Galaaz in GraalVM"
|
3
|
+
author: "Rodrigo Botafogo"
|
4
|
+
tags: [Galaaz, Ruby, R, TruffleRuby, FastR, GraalVM, knitr]
|
5
|
+
date: "19 October 2018"
|
6
|
+
output:
|
7
|
+
html_document:
|
8
|
+
self_contained: true
|
9
|
+
keep_md: true
|
10
|
+
pdf_document:
|
11
|
+
includes:
|
12
|
+
in_header: ["../../sty/galaaz.sty"]
|
13
|
+
number_sections: yes
|
14
|
+
---
|
15
|
+
|
16
|
+
```{r setup, echo=FALSE}
|
17
|
+
|
18
|
+
```
|
19
|
+
|
20
|
+
# Introduction
|
21
|
+
|
22
|
+
The idea of "literate programming" was first introduced by Donald Knuth in the 1980's.
|
23
|
+
The main intention of this approach was to develop software interspersing macro snippets,
|
24
|
+
traditional source code, and a natural language such as English that could be compiled into
|
25
|
+
executable code and at the same time easily read by a human developer. According to Knuth
|
26
|
+
"The practitioner of
|
27
|
+
literate programming can be regarded as an essayist, whose main concern is with exposition
|
28
|
+
and excellence of style."
|
29
|
+
|
30
|
+
The idea of literate programming evolved into the idea of reproducible research, in which
|
31
|
+
all the data, software code, documentation, graphics etc. needed to reproduce the research
|
32
|
+
and its reports could be included in a
|
33
|
+
single document or set of documents that when distributed to peers could be rerun generating
|
34
|
+
the same output and reports.
|
35
|
+
|
36
|
+
The R community has put a great deal of effort into reproducible research. In 2002, Sweave was
|
37
|
+
introduced and it allowed mixing R code with LaTeX, generating high quality PDF documents. Those
|
38
|
+
documents could include the code, the result of executing the code, graphics and text. This
|
39
|
+
contained the whole narrative to reproduce the research. But Sweave had many problems and in
|
40
|
+
2012, Knitr, developed by Yihui Xie from RStudio, was released, solving many of the long-lasting
|
41
|
+
problems from Sweave and including in one single package many extensions and add-on packages that
|
42
|
+
were necessary for Sweave.
|
43
|
+
|
44
|
+
With Knitr, R markdown was also developed, an extension of the
|
45
|
+
Markdown format. With R markdown and Knitr it is possible to generate reports in a multitude
|
46
|
+
of formats such as HTML, markdown, LaTeX, PDF, dvi, etc. R markdown also allows the use of
|
47
|
+
multiple programming languages in the same document. In R markdown text is interspersed with
|
48
|
+
code chunks that can be executed, and both the code and the result of executing the code can become
|
49
|
+
part of the final report. Although R markdown allows multiple programming languages in the
|
50
|
+
same document, only R and Python (with
|
51
|
+
the reticulate package) can persist variables between chunks. For other languages, such as
|
52
|
+
Ruby, every chunk will start a new process and thus all data is lost between chunks, unless it
|
53
|
+
is somehow stored in a data file that is read by the next chunk.
|
54
|
+
|
55
|
+
Being able to persist data
|
56
|
+
between chunks is critical for literate programming; otherwise the flow of the narrative is lost
|
57
|
+
in all the effort of having to save data and then reload it. Probably because of this limitation,
|
58
|
+
it is very rare to see any R markdown document in the Ruby community.
|
59
|
+
|
60
|
+
In the Python community, the same effort to have code and text in an integrated environment
|
61
|
+
also started in the first decade of the 2000s. In 2006 iPython 0.7.2 was released. In 2014,
|
62
|
+
Fernando Pérez spun off project Jupyter from iPython, creating a web-based interactive
|
63
|
+
computation environment. Jupyter can now be used with many languages, including Ruby with the
|
64
|
+
iruby gem (https://github.com/SciRuby/iruby). I am not sure if multiple languages can be used
|
65
|
+
in a Jupyter notebook.
|
66
|
+
|
67
|
+
# gKnitting a Document
|
68
|
+
|
69
|
+
This document describes gKnit. gKnit uses Knitr and R markdown to knit a document in Ruby or R
|
70
|
+
and output it in any of the
|
71
|
+
available formats for R markdown. The only difference between gKnit and normal Knitr documents
|
72
|
+
is that gKnit runs atop GraalVM and Galaaz (an integration library between Ruby and R).
|
73
|
+
Another blog post on Galaaz and its integration with ggplot2 can be found at:
|
74
|
+
https://towardsdatascience.com/ruby-plotting-with-galaaz-an-example-of-tightly-coupling-ruby-and-r-in-graalvm-520b69e21021. With Galaaz, gKnit can knit documents in Ruby and R and both
|
75
|
+
Ruby and R execute in the same process and memory, so variables, classes, etc.
|
76
|
+
will be preserved between chunks of code.
|
77
|
+
|
78
|
+
This is not a blog post on rmarkdown, and the interested user is directed to
|
79
|
+
|
80
|
+
* https://rmarkdown.rstudio.com/ or
|
81
|
+
* https://bookdown.org/yihui/rmarkdown/ for detailed information on its capabilities and use.
|
82
|
+
|
83
|
+
Here, we will briefly describe the main aspects of R markdown, so the user can start gKnitting
|
84
|
+
Ruby and R documents quickly.
|
85
|
+
|
86
|
+
## The Yaml header
|
87
|
+
|
88
|
+
An R markdown document should start with a Yaml header and be stored in a file with '.Rmd' extension.
|
89
|
+
This document has the following header for gKnitting an HTML document.
|
90
|
+
|
91
|
+
```
|
92
|
+
---
|
93
|
+
title: "gKnit - Ruby and R Knitting with Galaaz in GraalVM"
|
94
|
+
author: "Rodrigo Botafogo"
|
95
|
+
tags: [Galaaz, Ruby, R, TruffleRuby, FastR, GraalVM, knitr, gknit]
|
96
|
+
date: "29 October 2018"
|
97
|
+
output:
|
98
|
+
html_document:
|
99
|
+
keep_md: true
|
100
|
+
---
|
101
|
+
```
|
102
|
+
|
103
|
+
For more information on the options in the Yaml header, check https://bookdown.org/yihui/rmarkdown/html-document.html.
|
104
|
+
|
105
|
+
## R Markdown formatting
|
106
|
+
|
107
|
+
Document formatting can be done with simple markups such as:
|
108
|
+
|
109
|
+
### Headers
|
110
|
+
|
111
|
+
```
|
112
|
+
# Header 1
|
113
|
+
|
114
|
+
## Header 2
|
115
|
+
|
116
|
+
### Header 3
|
117
|
+
|
118
|
+
```
|
119
|
+
|
120
|
+
### Lists
|
121
|
+
|
122
|
+
```
|
123
|
+
Unordered lists:
|
124
|
+
|
125
|
+
* Item 1
|
126
|
+
* Item 2
|
127
|
+
+ Item 2a
|
128
|
+
+ Item 2b
|
129
|
+
```
|
130
|
+
|
131
|
+
```
|
132
|
+
Ordered Lists
|
133
|
+
|
134
|
+
1. Item 1
|
135
|
+
2. Item 2
|
136
|
+
3. Item 3
|
137
|
+
+ Item 3a
|
138
|
+
+ Item 3b
|
139
|
+
```
|
140
|
+
|
141
|
+
Please see https://rmarkdown.rstudio.com/authoring_basics.html for more R markdown formatting.
|
142
|
+
|
143
|
+
## Code Chunks
|
144
|
+
|
145
|
+
Running and executing Ruby and R code is what really interests us in this blog. Inserting
|
146
|
+
a code chunk is done by adding code in a block delimited by three back ticks followed by a
|
147
|
+
block with the engine name (r, ruby, rb, include, others), an optional chunk_label and optional
|
148
|
+
options, as shown below:
|
149
|
+
|
150
|
+
````
|
151
|
+
```{engine_name [chunk_label], [chunk_options]}`r ''`
|
152
|
+
```
|
153
|
+
````
|
154
|
+
|
155
|
+
For instance, let's add an R chunk to the document labeled 'first_r_chunk'. In this case, the
|
156
|
+
code should not be shown in the document, so the option 'echo=FALSE' was added.
|
157
|
+
|
158
|
+
````
|
159
|
+
```{r first_r_chunk, echo = FALSE}`r ''`
|
160
|
+
```
|
161
|
+
````
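
To make the header layout concrete, here is a minimal sketch in plain Ruby that splits a chunk header into its engine name, label and options. The helper name `parse_chunk_header` is hypothetical and is not part of gKnit or knitr; it also assumes option values contain no commas, which real knitr options may.

```ruby
# Hypothetical helper (not part of gKnit or knitr): split a chunk header
# of the form {engine_name [chunk_label], [chunk_options]} into its parts.
# Assumption: option values contain no commas.
def parse_chunk_header(header)
  # Text between the braces, e.g. "r first_r_chunk, echo = FALSE".
  inner = header[/\{(.*)\}/, 1].to_s.strip
  head, *option_parts = inner.split(',').map(&:strip)
  # The first word is the engine; whatever follows is the optional label.
  engine, label = head.to_s.split(/\s+/, 2)
  options = option_parts.to_h do |part|
    key, value = part.split('=', 2).map(&:strip)
    [key, value]
  end
  { engine: engine, label: label, options: options }
end

p parse_chunk_header('```{r first_r_chunk, echo = FALSE}')
```

Option values are kept as strings here; knitr itself evaluates them as R expressions.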
|
162
|
+
|
163
|
+
A description of the available chunk options can be found in the documentation cited above.
|
164
|
+
|
165
|
+
For including a Ruby chunk, just change the name of the engine to ruby as follows:
|
166
|
+
|
167
|
+
````
|
168
|
+
```{ruby first_ruby_chunk}`r ''`
|
169
|
+
```
|
170
|
+
````
|
171
|
+
|
172
|
+
In this example, the ruby chunk is called 'first_ruby_chunk'. One important aspect of chunk
|
173
|
+
labels is that they cannot be duplicated. If a chunk label is duplicated, the knitting will
|
174
|
+
stop with an error.
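
Since a duplicated label stops the knitting, it can help to check a document beforehand. The following is only a sketch in plain Ruby: `duplicate_chunk_labels` is a hypothetical helper, not a gKnit feature, and it only inspects 'r' and 'ruby' chunk headers.

```ruby
# Hypothetical helper (not part of gKnit): list chunk labels that appear
# more than once. Only 'r' and 'ruby' headers are inspected, since 'rb'
# and 'include' headers carry code or a file name in the label position.
FENCE = '`' * 3

def duplicate_chunk_labels(rmd_text)
  labels = rmd_text.each_line.filter_map do |line|
    token = line[/^`{3}\{(?:r|ruby)\s+([^,}\s]+)/, 1]
    # A first token such as "echo=FALSE" is an option, not a label.
    token unless token.nil? || token.include?('=')
  end
  labels.tally.select { |_label, count| count > 1 }.keys
end

sample = [
  "#{FENCE}{r data_creation}", 'r_vec <- c(1, 2, 3)', FENCE,
  "#{FENCE}{ruby split_data}", '$a = [1, 2, 3]', FENCE,
  "#{FENCE}{r data_creation, echo = FALSE}", 'print(r_vec)', FENCE
].join("\n")

p duplicate_chunk_labels(sample) # prints ["data_creation"]
```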
|
175
|
+
|
176
|
+
### R chunks
|
177
|
+
|
178
|
+
Let's now add an R chunk to this document. In this example, a vector 'r_vec' is created and
|
179
|
+
a new function 'redef_sum' is defined. The chunk specification is
|
180
|
+
|
181
|
+
````
|
182
|
+
```{r data_creation}`r ''`
|
183
|
+
r_vec <- c(1, 2, 3, 4, 5)
|
184
|
+
|
185
|
+
redef_sum <- function(...) {
|
186
|
+
Reduce(sum, as.list(...))
|
187
|
+
}
|
188
|
+
```
|
189
|
+
````
|
190
|
+
|
191
|
+
and this is what it will look like once executed. From now on, we will not show the chunk
|
192
|
+
definition any longer.
|
193
|
+
|
194
|
+
|
195
|
+
```{r data_creation}
|
196
|
+
r_vec <- c(1, 2, 3, 4, 5)
|
197
|
+
|
198
|
+
redef_sum <- function(...) {
|
199
|
+
Reduce(sum, as.list(...))
|
200
|
+
}
|
201
|
+
```
|
202
|
+
|
203
|
+
We can, possibly in another chunk, access the vector and call the function as follows:
|
204
|
+
|
205
|
+
```{r using_previous}
|
206
|
+
print(r_vec)
|
207
|
+
print(redef_sum(r_vec))
|
208
|
+
```
|
209
|
+
|
210
|
+
```{r bubble}
|
211
|
+
# load package and data
|
212
|
+
library(ggplot2)
|
213
|
+
data(mpg, package="ggplot2")
|
214
|
+
# mpg <- read.csv("http://goo.gl/uEeRGu")
|
215
|
+
|
216
|
+
mpg_select <- mpg[mpg$manufacturer %in% c("audi", "ford", "honda", "hyundai"), ]
|
217
|
+
|
218
|
+
# Scatterplot
|
219
|
+
theme_set(theme_bw()) # pre-set the bw theme.
|
220
|
+
g <- ggplot(mpg_select, aes(displ, cty)) +
|
221
|
+
labs(subtitle="mpg: Displacement vs City Mileage",
|
222
|
+
title="Bubble chart")
|
223
|
+
|
224
|
+
g + geom_jitter(aes(col=manufacturer, size=hwy)) +
|
225
|
+
geom_smooth(aes(col=manufacturer), method="lm", se=F)
|
226
|
+
```
|
227
|
+
|
228
|
+
### Ruby chunks
|
229
|
+
|
230
|
+
In the same way that an R chunk was created, let's now create a Ruby chunk. One important aspect
|
231
|
+
of Ruby is that every evaluation of a chunk occurs in its own local scope, so, creating
|
232
|
+
a variable in a chunk will be out of scope in the next chunk. To make sure that variables are
|
233
|
+
available between chunks, they should be made global.
|
234
|
+
|
235
|
+
In this chunk, variables '\$a', '\$b' and '\$c' are standard Ruby variables and '\$vec' and '\$vec2'
|
236
|
+
are two vectors created by a call to FastR. It should be clear that there is no requirement
|
237
|
+
in gknit to call or use R functions. gKnit will knit standard Ruby code, or even general
|
238
|
+
text without code.
|
239
|
+
|
240
|
+
```{ruby split_data}
|
241
|
+
$a = [1, 2, 3]
|
242
|
+
$b = "US$ 250.000"
|
243
|
+
$c = "Inline text in a Heading"
|
244
|
+
|
245
|
+
$vec = R.c(1, 2, 3)
|
246
|
+
$vec2 = R.c(10, 20, 30)
|
247
|
+
```
|
248
|
+
|
249
|
+
In this next block, variables '\$a', '\$vec' and '\$vec2' are used and printed.
|
250
|
+
|
251
|
+
```{ruby split2}
|
252
|
+
puts $a
|
253
|
+
puts $vec * $vec2
|
254
|
+
```
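
The scoping rule described above can be illustrated outside gKnit with plain Ruby. This is a sketch under an assumption: each chunk is simulated by evaluating a string in a fresh method frame, which is a stand-in and not necessarily how gKnit evaluates chunks internally. The helper `run_chunk` is hypothetical.

```ruby
# Hypothetical stand-in for chunk evaluation: every call evaluates the
# code in a new method frame, so local variables do not survive the call.
def run_chunk(code)
  Object.new.instance_eval(code)
end

# First "chunk": define one local and one global variable.
run_chunk('local_a = [1, 2, 3]; $global_a = [1, 2, 3]')

# Second "chunk": the local is gone, so referencing it raises NameError.
local_visible = begin
  run_chunk('local_a')
  true
rescue NameError
  false
end

# The global is still reachable.
global_value = run_chunk('$global_a')

puts "local visible in the next chunk?  #{local_visible}"
puts "global visible in the next chunk? #{global_value.inspect}"
```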
|
255
|
+
|
256
|
+
### Accessing R from Ruby
|
257
|
+
|
258
|
+
One of the nice aspects of Galaaz on GraalVM is that variables and functions defined in R can
|
259
|
+
be easily accessed from Ruby. This next chunk reads data from R and uses the 'redef_sum'
|
260
|
+
function defined previously. To access an R variable from Ruby the '~' function should be
|
261
|
+
applied to the Ruby symbol representing the R variable. Since the R variable is called 'r_vec',
|
262
|
+
in Ruby, the symbol to access it is ':r_vec' and thus '~:r_vec' retrieves the value of the
|
263
|
+
variable.
|
264
|
+
|
265
|
+
```{ruby access_r}
|
266
|
+
puts ~:r_vec
|
267
|
+
```
|
268
|
+
|
269
|
+
In order to call an R function, the 'R' module is used as follows:
|
270
|
+
|
271
|
+
```{ruby call_r_func}
|
272
|
+
puts R.redef_sum($vec)
|
273
|
+
```
|
274
|
+
|
275
|
+
### Inline Ruby code
|
276
|
+
|
277
|
+
Knitr allows inserting R inline by adding
|
278
|
+
```{rb puts "`r code`"}
|
279
|
+
```
|
280
|
+
. Unfortunately, this is not possible with Ruby code as there is no provision in knitr for
|
281
|
+
adding this kind of inline engine. However, gKnit allows adding inline Ruby code with the
|
282
|
+
'rb' engine. The following text will create an inline Ruby text:
|
283
|
+
|
284
|
+
````
|
285
|
+
This is some text with inline Ruby accessing variable \$b which has value:
|
286
|
+
```{rb puts "```{rb puts $b}\n```"}
|
287
|
+
```
|
288
|
+
and is followed by some other text!
|
289
|
+
````
|
290
|
+
|
291
|
+
The result of executing the above chunk is the following sentence with inline Ruby code
|
292
|
+
|
293
|
+
<div style="margin-bottom:50px;">
|
294
|
+
</div>
|
295
|
+
|
296
|
+
This is some text with inline Ruby accessing variable \$b which has value:
|
297
|
+
```{rb puts $b}
|
298
|
+
```
|
299
|
+
and is followed by some other text!
|
300
|
+
|
301
|
+
<div style="margin-bottom:50px;">
|
302
|
+
</div>
|
303
|
+
|
304
|
+
In an inline block, it is possible to execute multiple Ruby statements by adding a semicolon
|
305
|
+
between them:
|
306
|
+
|
307
|
+
````
|
308
|
+
Multiple statements in the 'rb' engine use semicolon:
|
309
|
+
```{rb puts "```{rb puts $a, puts $b}\n```"}
|
310
|
+
```
|
311
|
+
````
|
312
|
+
|
313
|
+
<div style="margin-bottom:50px;">
|
314
|
+
</div>
|
315
|
+
|
316
|
+
|
317
|
+
Multiple statements in the 'rb' engine use semicolon:
|
318
|
+
```{rb puts $a; puts $b}
|
319
|
+
```
|
320
|
+
|
321
|
+
<div style="margin-bottom:50px;">
|
322
|
+
</div>
|
323
|
+
|
324
|
+
|
325
|
+
```{rb puts "### #{$c}"}
|
326
|
+
```
|
327
|
+
|
328
|
+
Sometimes one wants to add an inline text in a heading. To do that in Ruby the whole heading
|
329
|
+
needs to be returned by the inline Ruby engine. For example, the heading above was created by
|
330
|
+
the following chunk:
|
331
|
+
|
332
|
+
````
|
333
|
+
```{rb puts %q(```{rb puts "### #{$c}"}\n```)}
|
334
|
+
```
|
335
|
+
````
|
336
|
+
|
337
|
+
Remember that variable '\$c' was defined in a previous Ruby chunk and is now being used to
|
338
|
+
create the section heading for this section.
|
339
|
+
|
340
|
+
|
341
|
+
### Plotting
|
342
|
+
|
343
|
+
```{ruby diverging_bar}
|
344
|
+
require 'ggplot'
|
345
|
+
|
346
|
+
R.theme_set R.theme_bw
|
347
|
+
|
348
|
+
# Data Prep
|
349
|
+
mtcars = ~:mtcars
|
350
|
+
mtcars.car_name = R.rownames(:mtcars)
|
351
|
+
# compute normalized mpg
|
352
|
+
mtcars.mpg_z = ((mtcars.mpg - mtcars.mpg.mean)/mtcars.mpg.sd).round 2
|
353
|
+
mtcars.mpg_type = (mtcars.mpg_z < 0).ifelse("below", "above")
|
354
|
+
mtcars = mtcars[mtcars.mpg_z.order, :all]
|
355
|
+
# convert to factor to retain sorted order in plot
|
356
|
+
mtcars.car_name = mtcars.car_name.factor levels: mtcars.car_name
|
357
|
+
|
358
|
+
# Diverging Barcharts
|
359
|
+
# R.png
|
360
|
+
gg = mtcars.ggplot(E.aes(x: :car_name, y: :mpg_z, label: :mpg_z)) +
|
361
|
+
R.geom_bar(E.aes(fill: :mpg_type), stat: 'identity', width: 0.5) +
|
362
|
+
R.scale_fill_manual(name: "Mileage",
|
363
|
+
labels: R.c("Above Average", "Below Average"),
|
364
|
+
values: R.c("above": "#00ba38", "below": "#f8766d")) +
|
365
|
+
R.labs(subtitle: "Normalised mileage from 'mtcars'",
|
366
|
+
title: "Diverging Bars") +
|
367
|
+
R.coord_flip()
|
368
|
+
print gg
|
369
|
+
# R.dev__off
|
370
|
+
# R.include_graphics("Rplot001.png")
|
371
|
+
```
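
The data preparation in the chunk above normalizes 'mpg' to z-scores, labels each value as above or below the mean, and sorts by the score. As a hedged illustration of that arithmetic only, here is the same computation in plain Ruby without Galaaz. The five values are sample data (the first five 'mpg' entries of R's mtcars dataset), and the sample standard deviation is used because that is what R's 'sd' computes.

```ruby
# Sample values only: the first five 'mpg' entries of R's mtcars dataset.
mpg = [21.0, 21.0, 22.8, 21.4, 18.7]

mean = mpg.sum / mpg.size
# Sample standard deviation (n - 1 denominator), matching R's sd().
sd = Math.sqrt(mpg.sum { |value| (value - mean)**2 } / (mpg.size - 1))

rows = mpg.map do |value|
  z = ((value - mean) / sd).round(2)
  { mpg: value, mpg_z: z, mpg_type: z < 0 ? 'below' : 'above' }
end

# Sort by z-score, as the chunk does with 'order'.
sorted_rows = rows.sort_by { |row| row[:mpg_z] }
sorted_rows.each { |row| puts row }
```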
|
372
|
+
|
373
|
+
### Including Ruby files
|
374
|
+
|
375
|
+
R is a language that was created to be easy and fast for statisticians to use. It was not a
|
376
|
+
language to be used for developing large systems. Of course, there are large systems and
|
377
|
+
libraries in R, but the focus of the language is on developing statistical models and
|
378
|
+
distributing them to peers.
|
379
|
+
|
380
|
+
Ruby on the other hand, is a language for large software development. Systems written in
|
381
|
+
Ruby will have dozens or hundreds of files. In order to document a large system with
|
382
|
+
literate programming we cannot expect the developer to add all the files in a single '.Rmd'
|
383
|
+
file. gKnit provides the 'include' chunk engine to include a Ruby file as if it had been
|
384
|
+
typed in the '.Rmd' file.
|
385
|
+
|
386
|
+
To include a file the following chunk should be created, where <filename> is the name of
|
387
|
+
the file to be included and where the extension, if it is '.rb', does not need to be added.
|
388
|
+
If the 'relative' option is not included, then it is treated as TRUE. When 'relative' is
|
389
|
+
true, 'require_relative' semantics are used to load the file; when false, Ruby's $LOAD_PATH
|
390
|
+
is searched to find the file and it is 'require'd.
|
391
|
+
|
392
|
+
````
|
393
|
+
```{include <filename>, relative = <TRUE/FALSE>}`r ''`
|
394
|
+
```
|
395
|
+
````
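
The lookup rule described above can be sketched in plain Ruby. This is not gKnit's implementation: `resolve_include` is a hypothetical helper, and the `base_dir` and `load_path` parameters are assumptions added so the sketch can be tried without gKnit. It only resolves the path and does not load the file.

```ruby
# Hypothetical helper (not gKnit's code): find the file an 'include'
# chunk would refer to, following the rule described in the text.
def resolve_include(name, relative: true, base_dir: Dir.pwd, load_path: $LOAD_PATH)
  # The '.rb' extension may be omitted in the chunk header.
  file = File.extname(name).empty? ? "#{name}.rb" : name

  # relative = TRUE: resolve against the directory of the '.Rmd' file.
  return File.expand_path(file, base_dir) if relative

  # relative = FALSE: take the first load path entry that has the file.
  dir = load_path.find { |path| File.file?(File.join(path, file)) }
  dir && File.join(dir, file)
end

puts resolve_include('model', base_dir: '/tmp/blog')
```

When 'relative' is false and no load path entry contains the file, the sketch returns nil rather than raising, which leaves error handling to the caller.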
|
396
|
+
|
397
|
+
Here we include file 'model.rb' which is in the same directory as this blog. This code
|
398
|
+
uses R's 'caret' package to split a dataset into train and test sets.
|
399
|
+
|
400
|
+
````
|
401
|
+
```{include model}`r ''`
|
402
|
+
```
|
403
|
+
````
|
404
|
+
|
405
|
+
```{include model}
|
406
|
+
```
|
407
|
+
|
408
|
+
```{ruby model_partition}
|
409
|
+
mtcars = ~:mtcars
|
410
|
+
model = Model.new(mtcars, percent_train: 0.8)
|
411
|
+
model.partition(:mpg)
|
412
|
+
puts model.train.head
|
413
|
+
puts model.test.head
|
414
|
+
```
|
415
|
+
|
416
|
+
### Documenting Gems
|
417
|
+
|
418
|
+
gKnit also allows developers to document and load files that are not in the same directory
|
419
|
+
as the '.Rmd' file. When using 'relative = FALSE' in a chunk header, gKnit will look for the
|
420
|
+
file in Ruby's \$LOAD_PATH and load it if found.
|
421
|
+
|
422
|
+
Here is an example of loading the 'continuation.rb' file from TruffleRuby.
|
423
|
+
|
424
|
+
````
|
425
|
+
```{include continuation, relative = FALSE}`r ''`
|
426
|
+
```
|
427
|
+
````
|
428
|
+
|
429
|
+
|
430
|
+
```{include continuation, relative = FALSE}
|
431
|
+
```
|
432
|
+
|
433
|
+
## Converting to PDF
|
434
|
+
|
435
|
+
One of the beauties of knitr is that the same input can be converted to many different outputs.
|
436
|
+
One very useful format is, of course, PDF. In order to convert an R markdown file to PDF
|
437
|
+
it is necessary to have LaTeX installed on the system. We will not explain here how to
|
438
|
+
install LaTeX as there are plenty of documents on the web showing how to proceed.
|
439
|
+
|
440
|
+
gKnit comes with a simple LaTeX style file for gknitting this blog as a PDF document. Here is
|
441
|
+
the Yaml header to generate this blog in PDF format instead of HTML:
|
442
|
+
|
443
|
+
```
|
444
|
+
---
|
445
|
+
title: "gKnit - Ruby and R Knitting with Galaaz in GraalVM"
|
446
|
+
author: "Rodrigo Botafogo"
|
447
|
+
tags: [Galaaz, Ruby, R, TruffleRuby, FastR, GraalVM, knitr, gknit]
|
448
|
+
date: "29 October 2018"
|
449
|
+
output:
|
450
|
+
pdf_document:
|
451
|
+
includes:
|
452
|
+
in_header: ["../../sty/galaaz.sty"]
|
453
|
+
number_sections: yes
|
454
|
+
---
|
455
|
+
```
|
456
|
+
|
457
|
+
# Conclusion
|
458
|
+
|
459
|
+
One of the promises of GraalVM is that users/developers will be able to use the best tool
|
460
|
+
for their task at hand, independently of the programming language the tool was written in. Galaaz
|
461
|
+
and gKnit are not trivial implementations atop the GraalVM and Truffle interop messages;
|
462
|
+
however, the time and effort it took to wrap Ruby over R - Galaaz - (not finished yet) or to
|
463
|
+
wrap Knitr with gKnit is a fraction of a fraction of a fraction of the time required to
|
464
|
+
implement the original tools. Trying to reimplement all R packages in Ruby would require the
|
465
|
+
same effort it is taking Python to implement NumPy, Pandas and all supporting libraries and it
|
466
|
+
is unlikely that this effort would ever be done. GraalVM has allowed Ruby to profit "almost
|
467
|
+
for free" from this huge set of libraries and tools that make R one of the most used
|
468
|
+
languages for data analysis and machine learning.
|
469
|
+
|
470
|
+
More interesting though than being able to wrap the R libraries with Ruby, is that Ruby adds
|
471
|
+
value to R, by allowing developers to use powerful and modern constructs for code reuse that
|
472
|
+
are not the strong points of R. As shown in this blog, R and Ruby can easily communicate
|
473
|
+
and R can be structured in classes and modules in a way that greatly expands its power and
|
474
|
+
readability.
|
475
|
+
|
476
|
+
# Installing gKnit
|
477
|
+
|
478
|
+
## Prerequisites
|
479
|
+
|
480
|
+
* GraalVM (>= rc8)
|
481
|
+
* TruffleRuby
|
482
|
+
* FastR
|
483
|
+
|
484
|
+
The following R packages will be automatically installed when necessary, but could be installed prior
|
485
|
+
to using gKnit if desired:
|
486
|
+
|
487
|
+
* ggplot2
|
488
|
+
* gridExtra
|
489
|
+
* knitr
|
490
|
+
|
491
|
+
Installation of R packages requires a development environment and can be time consuming. In Linux,
|
492
|
+
the gnu compiler and tools should be enough. I am not sure what is needed on the Mac.
|
493
|
+
|
494
|
+
## Preparation
|
495
|
+
|
496
|
+
* gem install galaaz
|
497
|
+
|
498
|
+
## Usage
|
499
|
+
|
500
|
+
* gknit <filename>
|