galaaz 0.4.5 → 0.4.6
- checksums.yaml +4 -4
- data/README.md +696 -270
- data/Rakefile +9 -22
- data/bin/gknit +2 -217
- data/bin/gknit_old_r +236 -0
- data/bin/grun +5 -0
- data/blogs/dev/dev.Rmd +7 -0
- data/blogs/dev/dev.html +34 -26
- data/blogs/dev/dev.md +40 -25
- data/blogs/dev/dev_files/figure-html/bubble-1.png +0 -0
- data/blogs/dev/dev_files/figure-html/diverging_bar. +0 -0
- data/blogs/dev/dev_files/figure-html/diverging_bar.png +0 -0
- data/blogs/galaaz_ggplot/galaaz_ggplot.Rmd +4 -4
- data/blogs/galaaz_ggplot/galaaz_ggplot.html +251 -59
- data/blogs/galaaz_ggplot/galaaz_ggplot.log +640 -0
- data/blogs/galaaz_ggplot/galaaz_ggplot.md +199 -95
- data/blogs/galaaz_ggplot/galaaz_ggplot.tex +45 -228
- data/blogs/galaaz_ggplot/midwest.png +0 -0
- data/blogs/galaaz_ggplot/scatter_plot.png +0 -0
- data/blogs/gknit/gknit.Rmd +271 -148
- data/blogs/manual/manual.Rmd +212 -0
- data/blogs/manual/manual.html +1832 -0
- data/blogs/manual/manual.md +751 -0
- data/blogs/manual/manual_files/figure-html/diverging_bar.png +0 -0
- data/blogs/ruby_plot/ruby_plot.Rmd +5 -69
- data/blogs/ruby_plot/ruby_plot.html +195 -236
- data/blogs/ruby_plot/ruby_plot.md +1 -261
- data/blogs/ruby_plot/ruby_plot_files/figure-html/dose_len.svg +38 -38
- data/examples/sthda_ggplot/two_variables_disc_cont/geom_dotplot.rb +5 -5
- data/examples/sthda_ggplot/two_variables_disc_cont/geom_jitter.rb +1 -0
- data/examples/sthda_ggplot/two_variables_disc_cont/geom_violin.rb +3 -7
- data/examples/sthda_ggplot/two_variables_error/geom_crossbar.rb +3 -1
- data/lib/R_interface/r.rb +12 -9
- data/lib/R_interface/r_methods.rb +2 -2
- data/lib/R_interface/rbinary_operators.rb +2 -20
- data/lib/R_interface/rdata_frame.rb +56 -9
- data/lib/R_interface/rdevices.R +0 -12
- data/lib/R_interface/rexpression.rb +0 -97
- data/lib/R_interface/rindexed_object.rb +12 -3
- data/lib/R_interface/rlanguage.rb +1 -1
- data/lib/R_interface/rlist.rb +29 -4
- data/lib/R_interface/rlogical_operators.rb +50 -0
- data/lib/R_interface/rmatrix.rb +7 -1
- data/lib/R_interface/robject.rb +29 -15
- data/lib/R_interface/rsupport.rb +74 -58
- data/lib/R_interface/rsymbol.rb +2 -1
- data/lib/R_interface/ruby_extensions.rb +11 -2
- data/lib/R_interface/rvector.rb +26 -11
- data/lib/gknit.rb +2 -0
- data/lib/gknit/include_engine.rb +57 -0
- data/lib/gknit/knitr_engine.rb +596 -50
- data/lib/gknit/rb_engine.rb +56 -0
- data/lib/gknit/ruby_engine.rb +13 -36
- data/lib/util/exec_ruby.rb +132 -21
- data/lib/util/inline_file.rb +9 -7
- data/specs/all.rb +5 -0
- data/specs/figures/bg.jpeg +0 -0
- data/specs/figures/bg.png +0 -0
- data/specs/figures/bg.svg +57 -0
- data/specs/figures/no_args.jpeg +0 -0
- data/specs/figures/no_args.png +0 -0
- data/specs/figures/no_args.svg +57 -0
- data/specs/figures/width_height.jpeg +0 -0
- data/specs/figures/width_height.png +0 -0
- data/specs/figures/width_height_units1.jpeg +0 -0
- data/specs/figures/width_height_units1.png +0 -0
- data/specs/figures/width_height_units2.jpeg +0 -0
- data/specs/figures/width_height_units2.png +0 -0
- data/specs/r_dataframe.spec.rb +29 -27
- data/specs/r_devices.spec.rb +347 -0
- data/specs/r_eval.spec.rb +10 -3
- data/specs/r_formula.spec.rb +2 -2
- data/specs/r_language.spec.rb +112 -0
- data/specs/r_list.spec.rb +174 -14
- data/specs/r_list_apply.spec.rb +17 -10
- data/specs/r_matrix.spec.rb +3 -3
- data/specs/r_vector_operators.spec.rb +13 -7
- data/specs/tmp.rb +42 -12
- data/version.rb +1 -1
- metadata +28 -24
- data/bin/gknit2 +0 -14
- data/bin/prepareR.rb +0 -3
- data/bin/tmp.py +0 -51
- data/blogs/gknit/gknit.html +0 -528
- data/blogs/gknit/gknit.md +0 -628
- data/blogs/gknit/gknit.pdf +0 -0
- data/blogs/gknit/gknit.tex +0 -745
- data/blogs/gknit/gknit_files/figure-html/bubble-1.png +0 -0
- data/blogs/gknit/gknit_files/figure-html/diverging_bar.png +0 -0
- data/blogs/ruby_plot/figures/dose_len.png +0 -0
- data/blogs/ruby_plot/figures/facet_by_delivery.png +0 -0
- data/blogs/ruby_plot/figures/facet_by_dose.png +0 -0
- data/blogs/ruby_plot/figures/facets_by_delivery_color.png +0 -0
- data/blogs/ruby_plot/figures/facets_by_delivery_color2.png +0 -0
- data/blogs/ruby_plot/figures/facets_with_decorations.png +0 -0
- data/blogs/ruby_plot/figures/facets_with_jitter.png +0 -0
- data/blogs/ruby_plot/figures/facets_with_points.png +0 -0
- data/blogs/ruby_plot/figures/final_box_plot.png +0 -0
- data/blogs/ruby_plot/figures/final_violin_plot.png +0 -0
- data/blogs/ruby_plot/figures/violin_with_jitter.png +0 -0
- data/lib/R/eng_ruby.R +0 -62
- data/lib/R_interface/rdevices.rb +0 -225
Binary file
|
Binary file
|
data/blogs/gknit/gknit.Rmd
CHANGED
@@ -1,16 +1,18 @@
|
|
1
1
|
---
|
2
|
-
title: "
|
3
|
-
author:
|
4
|
-
|
5
|
-
|
2
|
+
title: "How to do reproducible research in Ruby with gKnit"
|
3
|
+
author:
|
4
|
+
- "Rodrigo Botafogo"
|
5
|
+
- "Daniel Mossé - University of Pittsburgh"
|
6
|
+
tags: [Tech, Data Science, Ruby, R, GraalVM]
|
7
|
+
date: "20/02/2019"
|
6
8
|
output:
|
7
|
-
html_document:
|
8
|
-
self_contained: true
|
9
|
-
keep_md: true
|
10
9
|
pdf_document:
|
11
10
|
includes:
|
12
11
|
in_header: ["../../sty/galaaz.sty"]
|
13
12
|
number_sections: yes
|
13
|
+
html_document:
|
14
|
+
self_contained: true
|
15
|
+
keep_md: true
|
14
16
|
---
|
15
17
|
|
16
18
|
```{r setup, echo=FALSE}
|
@@ -21,7 +23,8 @@ output:
|
|
21
23
|
|
22
24
|
The idea of "literate programming" was first introduced by Donald Knuth in the 1980's.
|
23
25
|
The main intention of this approach was to develop software interspersing macro snippets,
|
24
|
-
traditional source code, and a natural language such as English
|
26
|
+
traditional source code, and a natural language such as English in a document
|
27
|
+
that could be compiled into
|
25
28
|
executable code and at the same time easily read by a human developer. According to Knuth
|
26
29
|
"The practitioner of
|
27
30
|
literate programming can be regarded as an essayist, whose main concern is with exposition
|
@@ -41,11 +44,11 @@ contained the whole narrative to reproduce the research. But Sweave had many pr
|
|
41
44
|
problems from Sweave and including in one single package many extensions and add-on packages that
|
42
45
|
were necessary for Sweave.
|
43
46
|
|
44
|
-
With Knitr, R markdown was also developed, an extension
|
47
|
+
With Knitr, R markdown was also developed, an extension to the
|
45
48
|
Markdown format. With R markdown and Knitr it is possible to generate reports in a multitude
|
46
49
|
of formats such as HTML, markdown, Latex, PDF, dvi, etc. R markdown also allows the use of
|
47
50
|
multiple programming languages in the same document. In R markdown text is interspersed with
|
48
|
-
code chunks that can be executed and both the code
|
51
|
+
code chunks that can be executed and both the code and its results can become
|
49
52
|
part of the final report. Although R markdown allows multiple programming languages in the
|
50
53
|
same document, only R and Python (with
|
51
54
|
the reticulate package) can persist variables between chunks. For other languages, such as
|
@@ -54,49 +57,64 @@ is somehow stored in a data file that is read by the next chunk.
|
|
54
57
|
|
55
58
|
Being able to persist data
|
56
59
|
between chunks is critical for literate programming otherwise the flow of the narrative is lost
|
57
|
-
by all the effort of having to save data and then reload it. Probably, because of
|
58
|
-
|
60
|
+
by all the effort of having to save data and then reload it. Probably, because of
|
61
|
+
this impossibility,
|
62
|
+
it is very rare to see any R markdown document in the Ruby community. Also, the use of
|
63
|
+
R markdown for the Ruby community would also require the Ruby developer to download R and
|
64
|
+
have some minimal knowledge of Knitr.
|
59
65
|
|
60
66
|
In the Python community, the same effort to have code and text in an integrated environment
|
61
|
-
started
|
67
|
+
started in the first decade of the 2000s. In 2006 iPython 0.7.2 was released. In 2014,
|
62
68
|
Fernando Pérez spun off project Jupyter from iPython, creating a web-based interactive
|
63
69
|
computation environment. Jupyter can now be used with many languages, including Ruby with the
|
64
70
|
iruby gem (https://github.com/SciRuby/iruby). I am not sure if multiple languages can be used
|
65
|
-
in a Jupyter notebook.
|
71
|
+
in a Jupyter notebook and if variables can persist between chunks.
|
66
72
|
|
67
73
|
# gKnitting a Document
|
68
74
|
|
69
75
|
This document describes gKnit. gKnit uses Knitr and R markdown to knit a document in Ruby or R
|
70
|
-
and output it in any of the
|
71
|
-
|
72
|
-
|
73
|
-
|
74
|
-
|
75
|
-
|
76
|
-
|
76
|
+
and output it in any of the available formats for R markdown.
|
77
|
+
gKnit runs atop GraalVM and Galaaz (an integration
|
78
|
+
library between Ruby and R). In gKnit, Ruby variables are persisted between chunks, making
|
79
|
+
it an ideal solution for literate programming in this language. Also, since it is based on
|
80
|
+
Galaaz, Ruby chunks can have access to R variables and Polyglot Programming with Ruby and R
|
81
|
+
is quite natural.
|
82
|
+
|
83
|
+
Galaaz has already been described in the following posts:
|
77
84
|
|
78
|
-
|
85
|
+
* https://towardsdatascience.com/ruby-plotting-with-galaaz-an-example-of-tightly-coupling-ruby-and-r-in-graalvm-520b69e21021.
|
86
|
+
* https://medium.freecodecamp.org/how-to-make-beautiful-ruby-plots-with-galaaz-320848058857
|
87
|
+
|
88
|
+
This is not a blog post on R markdown, and the interested user is directed to the following links
|
89
|
+
for detailed information on its capabilities and use.
|
79
90
|
|
80
91
|
* https://rmarkdown.rstudio.com/ or
|
81
|
-
* https://bookdown.org/yihui/rmarkdown/
|
92
|
+
* https://bookdown.org/yihui/rmarkdown/
|
82
93
|
|
83
94
|
Here, we will describe quickly the main aspects of R markdown, so the user can start gKnitting
|
84
95
|
Ruby and R documents quickly.
|
85
96
|
|
86
97
|
## The Yaml header
|
87
98
|
|
88
|
-
An R markdown document should start with a Yaml header and be stored in a file with
|
89
|
-
This document has the following header for gKitting an HTML document.
|
99
|
+
An R markdown document should start with a Yaml header and be stored in a file with
|
100
|
+
'.Rmd' extension. This document has the following header for gKnitting an HTML document.
|
90
101
|
|
91
102
|
```
|
92
103
|
---
|
93
|
-
title: "
|
94
|
-
author:
|
95
|
-
|
96
|
-
|
104
|
+
title: "How to do reproducible research in Ruby with gKnit"
|
105
|
+
author:
|
106
|
+
- "Rodrigo Botafogo"
|
107
|
+
- "Daniel Mossé - University of Pittsburgh"
|
108
|
+
tags: [Tech, Data Science, Ruby, R, GraalVM]
|
109
|
+
date: "20/02/2019"
|
97
110
|
output:
|
98
111
|
html_document:
|
112
|
+
self_contained: true
|
99
113
|
keep_md: true
|
114
|
+
pdf_document:
|
115
|
+
includes:
|
116
|
+
in_header: ["../../sty/galaaz.sty"]
|
117
|
+
number_sections: yes
|
100
118
|
---
|
101
119
|
```
|
102
120
|
|
@@ -140,62 +158,80 @@ Ordered Lists
|
|
140
158
|
|
141
159
|
Please, go to https://rmarkdown.rstudio.com/authoring_basics.html, for more R markdown formatting.
|
142
160
|
|
143
|
-
|
161
|
+
### R chunks
|
144
162
|
|
145
|
-
Running and executing Ruby and R code is actually what really interests us is this blog.
|
146
|
-
a code chunk is done by adding code in a block delimited by three back ticks
|
147
|
-
|
148
|
-
|
163
|
+
Running and executing Ruby and R code is what really interests us in this blog.
|
164
|
+
Inserting a code chunk is done by adding code in a block delimited by three back ticks
|
165
|
+
followed by an open
|
166
|
+
curly brace ('{') followed by the engine name (r, ruby, rb, include, ...), and
|
167
|
+
any optional chunk_label and options, as shown below:
|
149
168
|
|
150
169
|
````
|
151
170
|
```{engine_name [chunk_label], [chunk_options]}`r ''`
|
152
171
|
```
|
153
172
|
````
|
154
173
|
|
155
|
-
for instance, let's add an R chunk to the document labeled 'first_r_chunk'.
|
156
|
-
code
|
174
|
+
for instance, let's add an R chunk to the document labeled 'first_r_chunk'. This is
|
175
|
+
a very simple code just to create a variable and print it out. The code block should
|
176
|
+
be defined as follows:
|
157
177
|
|
158
178
|
````
|
159
|
-
```{r first_r_chunk
|
179
|
+
```{r first_r_chunk}`r ''`
|
180
|
+
vec <- c(1, 2, 3)
|
181
|
+
print(vec)
|
160
182
|
```
|
161
183
|
````
|
162
184
|
|
163
|
-
|
185
|
+
If this block is added to an R markdown document and gKnitted the result will be:
|
186
|
+
|
187
|
+
```{r first_r_chunk}
|
188
|
+
vec <- c(1, 2, 3)
|
189
|
+
print(vec)
|
190
|
+
```
|
164
191
|
|
165
|
-
|
192
|
+
Now let's say that we want to do some analysis in the code, but just print the result and not the
|
193
|
+
code itself. For this, we need to add the option 'echo = FALSE'.
|
166
194
|
|
167
195
|
````
|
168
|
-
```{
|
196
|
+
```{r second_r_chunk, echo = FALSE}`r ''`
|
197
|
+
vec2 <- c(10, 20, 30)
|
198
|
+
vec3 <- vec * vec2
|
199
|
+
print(vec3)
|
169
200
|
```
|
170
201
|
````
|
202
|
+
Here is how this block will show up in the document. Observe that the code is not shown
|
203
|
+
and we only see the execution result in a white box
|
171
204
|
|
172
|
-
|
173
|
-
|
174
|
-
|
205
|
+
```{r second_r_chunk, echo = FALSE}
|
206
|
+
vec2 <- c(10, 20, 30)
|
207
|
+
vec3 <- vec * vec2
|
208
|
+
print(vec3)
|
209
|
+
```
|
175
210
|
|
176
|
-
|
211
|
+
A description of the available chunk options can be found in the documentation cited above.
|
177
212
|
|
178
|
-
Let's
|
179
|
-
|
213
|
+
Let's add another R chunk with a function definition. In this example, a vector
|
214
|
+
'r_vec' is created and
|
215
|
+
a new function 'reduce_sum' is defined. The chunk specification is
|
180
216
|
|
181
217
|
````
|
182
218
|
```{r data_creation}`r ''`
|
183
219
|
r_vec <- c(1, 2, 3, 4, 5)
|
184
220
|
|
185
|
-
|
221
|
+
reduce_sum <- function(...) {
|
186
222
|
Reduce(sum, as.list(...))
|
187
223
|
}
|
188
224
|
```
|
189
225
|
````
|
190
226
|
|
191
|
-
and this is how it will look like once executed. From now on, we will not
|
192
|
-
definition any longer.
|
227
|
+
and this is what it will look like once executed. From now on, we will not
|
228
|
+
show the chunk definition any longer.
|
193
229
|
|
194
230
|
|
195
231
|
```{r data_creation}
|
196
232
|
r_vec <- c(1, 2, 3, 4, 5)
|
197
233
|
|
198
|
-
|
234
|
+
reduce_sum <- function(...) {
|
199
235
|
Reduce(sum, as.list(...))
|
200
236
|
}
|
201
237
|
```
|
@@ -204,14 +240,19 @@ We can, possibly in another chunk, access the vector and call the function as fo
|
|
204
240
|
|
205
241
|
```{r using_previous}
|
206
242
|
print(r_vec)
|
207
|
-
print(
|
243
|
+
print(reduce_sum(r_vec))
|
208
244
|
```
|
245
|
+
### R Graphics with ggplot
|
209
246
|
|
210
|
-
|
247
|
+
In the following chunk, we create a bubble chart in R using ggplot and include it in
|
248
|
+
this document. Note that there is no directive in the code to include the image, this
|
249
|
+
occurs automatically. The 'mpg' dataframe is natively available to R and to Galaaz as
|
250
|
+
well.
|
251
|
+
|
252
|
+
```{r bubble, dev='png'}
|
211
253
|
# load package and data
|
212
254
|
library(ggplot2)
|
213
255
|
data(mpg, package="ggplot2")
|
214
|
-
# mpg <- read.csv("http://goo.gl/uEeRGu")
|
215
256
|
|
216
257
|
mpg_select <- mpg[mpg$manufacturer %in% c("audi", "ford", "honda", "hyundai"), ]
|
217
258
|
|
@@ -221,42 +262,64 @@ g <- ggplot(mpg_select, aes(displ, cty)) +
|
|
221
262
|
labs(subtitle="mpg: Displacement vs City Mileage",
|
222
263
|
title="Bubble chart")
|
223
264
|
|
224
|
-
g + geom_jitter(aes(col=manufacturer, size=hwy)) +
|
225
|
-
|
265
|
+
g <- g + geom_jitter(aes(col=manufacturer, size=hwy)) +
|
266
|
+
geom_smooth(aes(col=manufacturer), method="lm", se=F)
|
267
|
+
|
226
268
|
```
|
227
269
|
|
228
270
|
### Ruby chunks
|
229
271
|
|
230
|
-
In the same way that an R chunk was created, let's now create a Ruby chunk. One important aspect
|
231
|
-
of Ruby is that in Ruby every evaluation of a chunk occurs on its own local scope, so, creating
|
232
|
-
a variable in a chunk will be out of scope in the next chunk. To make sure that variables are
|
233
|
-
available between chunks, they should be made global.
|
234
272
|
|
235
|
-
|
236
|
-
|
237
|
-
|
238
|
-
|
273
|
+
Including a Ruby chunk is just as easy as including an R chunk in the document: just
|
274
|
+
change the name of the engine to 'ruby'. It is also possible to pass chunk options
|
275
|
+
to the Ruby engine; however, this version does not accept all the options that are
|
276
|
+
available to R chunks. Future versions will add those options.
|
277
|
+
|
278
|
+
````
|
279
|
+
```{ruby first_ruby_chunk}`r ''`
|
280
|
+
```
|
281
|
+
````
|
282
|
+
|
283
|
+
In this example, the ruby chunk is called 'first_ruby_chunk'. One important
|
284
|
+
aspect of chunk labels is that they cannot be duplicated. If a chunk label is
|
285
|
+
duplicated, gKnitting will stop with an error.
|
286
|
+
|
287
|
+
Another relevant point with Ruby chunks is that they are evaluated in the scope
|
288
|
+
of a class called RubyChunk. To make sure that variables are
|
289
|
+
available between chunks, they should be made as instance variables of the
|
290
|
+
RubyChunk class. In the following chunk, variable '\@a', '\@b' and '\@c'
|
291
|
+
are standard Ruby variables and '\@vec' and '\@vec2' are two vectors created
|
292
|
+
by calling the 'c' method on the R module.
|
293
|
+
|
294
|
+
In Galaaz, the R module allows us to access R functions transparently. It
|
295
|
+
should be clear that there is no requirement in gknit to call or use any R
|
296
|
+
functions. gKnit will knit standard Ruby code, or even general text without
|
297
|
+
any code.
|
239
298
|
|
240
299
|
```{ruby split_data}
|
241
|
-
|
242
|
-
|
243
|
-
|
300
|
+
@a = [1, 2, 3]
|
301
|
+
@b = "US$ 250.000"
|
302
|
+
@c = "The 'outputs' function"
|
244
303
|
|
245
|
-
|
246
|
-
|
304
|
+
@vec = R.c(1, 2, 3)
|
305
|
+
@vec2 = R.c(10, 20, 30)
|
247
306
|
```
|
248
307
|
|
249
|
-
In this next block, variables '
|
308
|
+
In this next block, variables '\@a', '\@vec' and '\@vec2' are used and printed.
|
250
309
|
|
251
310
|
```{ruby split2}
|
252
|
-
puts
|
253
|
-
puts
|
311
|
+
puts @a
|
312
|
+
puts @vec * @vec2
|
254
313
|
```
|
255
314
|
|
315
|
+
Note that @a is a standard Ruby Array and @vec and @vec2 are vectors that behave accordingly,
|
316
|
+
where multiplication works as expected.
|
317
|
+
|
318
|
+
|
256
319
|
### Accessing R from Ruby
|
257
320
|
|
258
321
|
One of the nice aspects of Galaaz on GraalVM is that variables and functions defined in R can
|
259
|
-
be easily accessed from Ruby. This next chunk, reads data from R and uses the '
|
322
|
+
be easily accessed from Ruby. This next chunk reads data from R and uses the 'reduce_sum'
|
260
323
|
function defined previously. To access an R variable from Ruby the '~' function should be
|
261
324
|
applied to the Ruby symbol representing the R variable. Since the R variable is called 'r_vec',
|
262
325
|
in Ruby, the symbol to access it is ':r_vec' and thus '~:r_vec' retrieves the value of the
|
@@ -269,124 +332,176 @@ puts ~:r_vec
|
|
269
332
|
In order to call an R function, the 'R.' module is used as follows
|
270
333
|
|
271
334
|
```{ruby call_r_func}
|
272
|
-
puts R.
|
335
|
+
puts R.reduce_sum(~:r_vec)
|
273
336
|
```
|
274
337
|
|
275
|
-
###
|
338
|
+
### Ruby Plotting
|
339
|
+
|
340
|
+
We have seen an example of plotting with R. Plotting with Ruby does not require
|
341
|
+
anything different from plotting with R:
|
342
|
+
|
343
|
+
```{ruby diverging_bar, fig.width = 9.1, fig.height = 6.5}
|
344
|
+
require 'ggplot'
|
345
|
+
|
346
|
+
mtcars = ~:mtcars
|
276
347
|
|
277
|
-
|
278
|
-
|
348
|
+
mtcars.car_name = mtcars.rownames # create new column for car names
|
349
|
+
mtcars.mpg_z = ((mtcars.mpg - mtcars.mpg.mean) / mtcars.mpg.sd).round 2
|
350
|
+
mtcars.mpg_type = (mtcars.mpg_z < 0).ifelse('below', 'above')
|
351
|
+
mtcars = mtcars[mtcars.mpg_z.order, :all]
|
352
|
+
mtcars.car_name = R.factor(mtcars.car_name, levels: mtcars.car_name)
|
353
|
+
|
354
|
+
puts mtcars.ggplot(E.aes(x: :car_name, y: :mpg_z, label: :mpg_z)) +
|
355
|
+
R.geom_bar(E.aes(fill: :mpg_type), stat: 'identity', width: 0.5) +
|
356
|
+
R.scale_fill_manual(name: 'Mileage',
|
357
|
+
labels: R.c('Above Average', 'Below Average'),
|
358
|
+
values: R.c('above': '#00ba38', 'below': '#f8766d')) +
|
359
|
+
R.labs(subtitle: "Normalised mileage from 'mtcars'",
|
360
|
+
title: "Diverging Bars") +
|
361
|
+
R.coord_flip
|
279
362
|
```
|
280
|
-
|
281
|
-
|
282
|
-
|
363
|
+
|
364
|
+
### Inline Ruby code
|
365
|
+
|
366
|
+
When using a Ruby chunk, the code and the output are formatted in blocks as seen above.
|
367
|
+
This formatting is not always desired. Sometimes, one wants to have the result of the
|
368
|
+
Ruby evaluation included in the middle of a phrase. gKnit allows adding inline Ruby code
|
369
|
+
with the 'rb' engine. The following text will
|
370
|
+
create an inline Ruby text:
|
283
371
|
|
284
372
|
````
|
285
|
-
This is some text with inline Ruby accessing variable
|
286
|
-
```{rb puts "```{rb puts
|
373
|
+
This is some text with inline Ruby accessing variable \@b which has value:
|
374
|
+
```{rb puts "```{rb puts @b}\n```"}
|
287
375
|
```
|
288
376
|
and is followed by some other text!
|
289
377
|
````
|
290
378
|
|
291
|
-
|
379
|
+
Note that it is important not to add any new line before or after the code
|
380
|
+
block if we want everything to be in only one line, resulting in the following sentence
|
381
|
+
with inline Ruby code
|
292
382
|
|
293
|
-
<div style="margin-bottom:
|
383
|
+
<div style="margin-bottom:30px;">
|
294
384
|
</div>
|
295
385
|
|
296
|
-
This is some text with inline Ruby accessing variable
|
297
|
-
```{rb puts
|
386
|
+
This is some text with inline Ruby accessing variable \@b which has value:
|
387
|
+
```{rb puts @b}
|
298
388
|
```
|
299
389
|
and is followed by some other text!
|
300
390
|
|
301
|
-
<div style="margin-bottom:
|
302
|
-
</div>
|
303
|
-
|
304
|
-
In an inline block, it is possible to execute multiple Ruby statements by adding a semicolon
|
305
|
-
between them:
|
306
|
-
|
307
|
-
````
|
308
|
-
Multiple statements in the 'rb' engine use semicolon:
|
309
|
-
```{rb puts "```{rb puts $a, puts $b}\n```"}
|
310
|
-
```
|
311
|
-
````
|
312
|
-
|
313
|
-
<div style="margin-bottom:50px;">
|
391
|
+
<div style="margin-bottom:30px;">
|
314
392
|
</div>
|
315
393
|
|
316
394
|
|
317
|
-
|
318
|
-
|
395
|
+
```{ruby heading, echo = FALSE}
|
396
|
+
outputs "### #{@c}"
|
319
397
|
```
|
320
398
|
|
321
|
-
|
322
|
-
|
399
|
+
We have previously used the standard 'puts' method in Ruby chunks in order to get some
|
400
|
+
output. As can be seen, the result of a 'puts' is formatted inside a white box that
|
401
|
+
follows the code block. Many times however, we would like to do some processing in the
|
402
|
+
Ruby chunk and have the result of this processing generate an output that is
|
403
|
+
'included' in the document as if we had typed it in R markdown.
|
323
404
|
|
405
|
+
For example, suppose we want to create a new 'heading' in our document, but the heading
|
406
|
+
phrase is the result of some code processing: maybe it's the first line of a file we are
|
407
|
+
going to read. Method 'outputs' adds its output as if typed in the R markdown document.
|
324
408
|
|
325
|
-
|
326
|
-
|
409
|
+
Take now a look at variable '@c' (it was defined in a previous block above) as
|
410
|
+
'@c = "The 'outputs' function". "The 'outputs' function" is actually the name of this
|
411
|
+
section and it was created using the 'outputs' function inside a Ruby chunk.
|
327
412
|
|
328
|
-
|
329
|
-
needs to be returned by the inline Ruby engine. For example the heading above, was created by
|
330
|
-
the following chunk:
|
413
|
+
The ruby chunk to generate this heading is:
|
331
414
|
|
332
415
|
````
|
333
|
-
```{
|
416
|
+
```{ruby heading}`r ''`
|
417
|
+
outputs "### #{@c}"
|
334
418
|
```
|
335
419
|
````
|
336
420
|
|
337
|
-
|
338
|
-
create the section heading for this section.
|
421
|
+
The three '###' are the way we add a Heading 3 in R markdown.
|
339
422
|
|
340
423
|
|
341
|
-
###
|
424
|
+
### HTML Output from Ruby Chunks
|
342
425
|
|
343
|
-
|
344
|
-
|
426
|
+
We've just seen the use of method 'outputs' to add text to the R markdown
|
427
|
+
document. This technique can also be used to add HTML code to the document. In R
|
428
|
+
markdown any html code typed directly in the document will be properly rendered.
|
429
|
+
Here, for instance, is a table definition in HTML and its output in the document:
|
430
|
+
|
431
|
+
```
|
432
|
+
<table style="width:100%">
|
433
|
+
<tr>
|
434
|
+
<th>Firstname</th>
|
435
|
+
<th>Lastname</th>
|
436
|
+
<th>Age</th>
|
437
|
+
</tr>
|
438
|
+
<tr>
|
439
|
+
<td>Jill</td>
|
440
|
+
<td>Smith</td>
|
441
|
+
<td>50</td>
|
442
|
+
</tr>
|
443
|
+
<tr>
|
444
|
+
<td>Eve</td>
|
445
|
+
<td>Jackson</td>
|
446
|
+
<td>94</td>
|
447
|
+
</tr>
|
448
|
+
</table>
|
449
|
+
```
|
450
|
+
<div style="margin-bottom:30px;">
|
451
|
+
</div>
|
345
452
|
|
346
|
-
|
453
|
+
<table style="width:100%">
|
454
|
+
<tr>
|
455
|
+
<th>Firstname</th>
|
456
|
+
<th>Lastname</th>
|
457
|
+
<th>Age</th>
|
458
|
+
</tr>
|
459
|
+
<tr>
|
460
|
+
<td>Jill</td>
|
461
|
+
<td>Smith</td>
|
462
|
+
<td>50</td>
|
463
|
+
</tr>
|
464
|
+
<tr>
|
465
|
+
<td>Eve</td>
|
466
|
+
<td>Jackson</td>
|
467
|
+
<td>94</td>
|
468
|
+
</tr>
|
469
|
+
</table>
|
470
|
+
|
471
|
+
<div style="margin-bottom:30px;">
|
472
|
+
</div>
|
347
473
|
|
348
|
-
|
349
|
-
|
350
|
-
|
351
|
-
|
352
|
-
|
353
|
-
|
354
|
-
|
355
|
-
|
356
|
-
|
357
|
-
|
358
|
-
# Diverging Barcharts
|
359
|
-
# R.png
|
360
|
-
gg = mtcars.ggplot(E.aes(x: :car_name, y: :mpg_z, label: :mpg_z)) +
|
361
|
-
R.geom_bar(E.aes(fill: :mpg_type), stat: 'identity', width: 0.5) +
|
362
|
-
R.scale_fill_manual(name: "Mileage",
|
363
|
-
labels: R.c("Above Average", "Below Average"),
|
364
|
-
values: R.c("above": "#00ba38", "below": "#f8766d")) +
|
365
|
-
R.labs(subtitle: "Normalised mileage from 'mtcars'",
|
366
|
-
title: "Diverging Bars") +
|
367
|
-
R.coord_flip()
|
368
|
-
print gg
|
369
|
-
# R.dev__off
|
370
|
-
# R.include_graphics("Rplot001.png")
|
474
|
+
But manually creating HTML output is not always easy or desirable. The above
|
475
|
+
table certainly looks ugly. The 'kableExtra' library is a great library for
|
476
|
+
creating beautiful tables. Take a look at https://cran.r-project.org/web/packages/kableExtra/vignettes/awesome_table_in_html.html
|
477
|
+
|
478
|
+
In the next chunk, we output the 'mtcars' dataframe from R in a nicely formatted
|
479
|
+
table. Note that we retrieve the mtcars dataframe by using '~:mtcars'.
|
480
|
+
|
481
|
+
```{ruby nice_table}
|
482
|
+
R.library('kableExtra')
|
483
|
+
outputs (~:mtcars).kable.kable_styling
|
371
484
|
```
|
372
485
|
|
373
486
|
### Including Ruby files
|
374
487
|
|
375
|
-
R is a language that was created to be easy and fast for statisticians to use.
|
488
|
+
R is a language that was created to be easy and fast for statisticians to use. As far
|
489
|
+
as I know (and please correct me if you think otherwise), it was not a
|
376
490
|
language to be used for developing large systems. Of course, there are large systems and
|
377
491
|
libraries in R, but the focus of the language is for developing statistical models and
|
378
492
|
distribute that to peers.
|
379
493
|
|
380
494
|
Ruby on the other hand, is a language for large software development. Systems written in
|
381
|
-
Ruby will have dozens or
|
495
|
+
Ruby will have dozens, hundreds or even thousands of files. In order to document a
|
496
|
+
large system with
|
382
497
|
literate programming we cannot expect the developer to add all the files in a single '.Rmd'
|
383
498
|
file. gKnit provides the 'include' chunk engine to include a Ruby file as if it had been
|
384
499
|
typed in the '.Rmd' file.
|
385
500
|
|
386
|
-
To include a file the following chunk should be created, where <filename> is the name of
|
501
|
+
To include a file, the following chunk should be created, where <filename> is the name of
|
387
502
|
the file to be included and where the extension, if it is '.rb', does not need to be added.
|
388
503
|
If the 'relative' option is not included, then it is treated as TRUE. When 'relative' is
|
389
|
-
true, 'require_relative' semantics is used to load the file, when false, Ruby's
|
504
|
+
true, 'require_relative' semantics is used to load the file, when false, Ruby's \$LOAD_PATH
|
390
505
|
is searched to find the file and it is 'require'd.
|
391
506
|
|
392
507
|
````
|
@@ -394,8 +509,17 @@ is searched to find the file and it is 'require'd.
|
|
394
509
|
```
|
395
510
|
````
|
396
511
|
|
397
|
-
Here we include file 'model.rb' which is in the same directory of this blog.
|
398
|
-
uses R 'caret' package to split a dataset in a train and test sets.
|
512
|
+
Here we include file 'model.rb' which is in the same directory of this blog.
|
513
|
+
This code uses R's 'caret' package to split a dataset into train and test sets.
|
514
|
+
The 'caret' package is a very important and useful package for doing Data Analysis,
|
515
|
+
it has hundreds of functions for all steps of the Data Analysis workflow. To
|
516
|
+
just split a dataset it is using the proverbial cannon to kill the fly. We use
|
517
|
+
it here only to show that integrating Ruby and R and using even a very complex
|
518
|
+
package as 'caret' is trivial with Galaaz.
|
519
|
+
|
520
|
+
A word of advice: the 'caret' package has lots of dependencies and installing
|
521
|
+
it in a Linux system is a time consuming operation. Method 'R.install_and_loads'
|
522
|
+
will install the package if it is not already installed and can take a while.
|
399
523
|
|
400
524
|
````
|
401
525
|
```{include model}`r ''`
|
@@ -419,15 +543,14 @@ gKnit also allows developers to document and load files that are not in the same
|
|
419
543
|
of the '.Rmd' file. When using 'relative = FALSE' in a chunk header, gKnit will look for the
|
420
544
|
file in Ruby's \$LOAD_PATH and load it if found.
|
421
545
|
|
422
|
-
Here is an example of loading the '
|
546
|
+
Here is an example of loading the 'find.rb' file from TruffleRuby.
|
423
547
|
|
424
548
|
````
|
425
|
-
```{include
|
549
|
+
```{include find, relative = FALSE}`r ''`
|
426
550
|
```
|
427
551
|
````
|
428
552
|
|
429
|
-
|
430
|
-
```{include continuation, relative = FALSE}
|
553
|
+
```{include find, relative = FALSE}
|
431
554
|
```
|
432
555
|
|
433
556
|
## Converting to PDF
|
@@ -497,4 +620,4 @@ the gnu compiler and tools should be enough. I am not sure what is needed on th
|
|
497
620
|
|
498
621
|
## Usage
|
499
622
|
|
500
|
-
* gknit
|
623
|
+
* gknit \<filename\>
|
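
The scoping rule introduced in the gknit.Rmd changes above (Ruby chunks are evaluated in the scope of a class called RubyChunk, and variables must be instance variables to persist between chunks) can be illustrated in plain Ruby. This is only a sketch under assumptions: the `RubyChunk` class below is an empty stand-in, and `run_chunk` is a hypothetical helper built on `instance_eval`; the diff does not show how gKnit actually evaluates chunks.

```ruby
# Stand-in for the class in whose scope Ruby chunks are evaluated.
# NOT gKnit's implementation; it only illustrates the scoping rule.
class RubyChunk; end

# Hypothetical helper: evaluate a chunk's source text with RubyChunk as self.
def run_chunk(source)
  RubyChunk.instance_eval(source)
end

# First chunk: defines one local variable and one instance variable.
run_chunk(<<~RUBY)
  local_total = 6
  @a = [1, 2, 3]
RUBY

# Second chunk: @a is still visible, while local_total is out of scope.
second = run_chunk(<<~RUBY)
  [@a.sum, defined?(local_total)]
RUBY

p second  # => [6, nil]
```

The local variable disappears because each chunk is evaluated in a fresh local scope, whereas '@a' is stored on the RubyChunk object itself and so remains available to later chunks.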