galaaz 0.4.5 → 0.4.6

Files changed (102)
  1. checksums.yaml +4 -4
  2. data/README.md +696 -270
  3. data/Rakefile +9 -22
  4. data/bin/gknit +2 -217
  5. data/bin/gknit_old_r +236 -0
  6. data/bin/grun +5 -0
  7. data/blogs/dev/dev.Rmd +7 -0
  8. data/blogs/dev/dev.html +34 -26
  9. data/blogs/dev/dev.md +40 -25
  10. data/blogs/dev/dev_files/figure-html/bubble-1.png +0 -0
  11. data/blogs/dev/dev_files/figure-html/diverging_bar. +0 -0
  12. data/blogs/dev/dev_files/figure-html/diverging_bar.png +0 -0
  13. data/blogs/galaaz_ggplot/galaaz_ggplot.Rmd +4 -4
  14. data/blogs/galaaz_ggplot/galaaz_ggplot.html +251 -59
  15. data/blogs/galaaz_ggplot/galaaz_ggplot.log +640 -0
  16. data/blogs/galaaz_ggplot/galaaz_ggplot.md +199 -95
  17. data/blogs/galaaz_ggplot/galaaz_ggplot.tex +45 -228
  18. data/blogs/galaaz_ggplot/midwest.png +0 -0
  19. data/blogs/galaaz_ggplot/scatter_plot.png +0 -0
  20. data/blogs/gknit/gknit.Rmd +271 -148
  21. data/blogs/manual/manual.Rmd +212 -0
  22. data/blogs/manual/manual.html +1832 -0
  23. data/blogs/manual/manual.md +751 -0
  24. data/blogs/manual/manual_files/figure-html/diverging_bar.png +0 -0
  25. data/blogs/ruby_plot/ruby_plot.Rmd +5 -69
  26. data/blogs/ruby_plot/ruby_plot.html +195 -236
  27. data/blogs/ruby_plot/ruby_plot.md +1 -261
  28. data/blogs/ruby_plot/ruby_plot_files/figure-html/dose_len.svg +38 -38
  29. data/examples/sthda_ggplot/two_variables_disc_cont/geom_dotplot.rb +5 -5
  30. data/examples/sthda_ggplot/two_variables_disc_cont/geom_jitter.rb +1 -0
  31. data/examples/sthda_ggplot/two_variables_disc_cont/geom_violin.rb +3 -7
  32. data/examples/sthda_ggplot/two_variables_error/geom_crossbar.rb +3 -1
  33. data/lib/R_interface/r.rb +12 -9
  34. data/lib/R_interface/r_methods.rb +2 -2
  35. data/lib/R_interface/rbinary_operators.rb +2 -20
  36. data/lib/R_interface/rdata_frame.rb +56 -9
  37. data/lib/R_interface/rdevices.R +0 -12
  38. data/lib/R_interface/rexpression.rb +0 -97
  39. data/lib/R_interface/rindexed_object.rb +12 -3
  40. data/lib/R_interface/rlanguage.rb +1 -1
  41. data/lib/R_interface/rlist.rb +29 -4
  42. data/lib/R_interface/rlogical_operators.rb +50 -0
  43. data/lib/R_interface/rmatrix.rb +7 -1
  44. data/lib/R_interface/robject.rb +29 -15
  45. data/lib/R_interface/rsupport.rb +74 -58
  46. data/lib/R_interface/rsymbol.rb +2 -1
  47. data/lib/R_interface/ruby_extensions.rb +11 -2
  48. data/lib/R_interface/rvector.rb +26 -11
  49. data/lib/gknit.rb +2 -0
  50. data/lib/gknit/include_engine.rb +57 -0
  51. data/lib/gknit/knitr_engine.rb +596 -50
  52. data/lib/gknit/rb_engine.rb +56 -0
  53. data/lib/gknit/ruby_engine.rb +13 -36
  54. data/lib/util/exec_ruby.rb +132 -21
  55. data/lib/util/inline_file.rb +9 -7
  56. data/specs/all.rb +5 -0
  57. data/specs/figures/bg.jpeg +0 -0
  58. data/specs/figures/bg.png +0 -0
  59. data/specs/figures/bg.svg +57 -0
  60. data/specs/figures/no_args.jpeg +0 -0
  61. data/specs/figures/no_args.png +0 -0
  62. data/specs/figures/no_args.svg +57 -0
  63. data/specs/figures/width_height.jpeg +0 -0
  64. data/specs/figures/width_height.png +0 -0
  65. data/specs/figures/width_height_units1.jpeg +0 -0
  66. data/specs/figures/width_height_units1.png +0 -0
  67. data/specs/figures/width_height_units2.jpeg +0 -0
  68. data/specs/figures/width_height_units2.png +0 -0
  69. data/specs/r_dataframe.spec.rb +29 -27
  70. data/specs/r_devices.spec.rb +347 -0
  71. data/specs/r_eval.spec.rb +10 -3
  72. data/specs/r_formula.spec.rb +2 -2
  73. data/specs/r_language.spec.rb +112 -0
  74. data/specs/r_list.spec.rb +174 -14
  75. data/specs/r_list_apply.spec.rb +17 -10
  76. data/specs/r_matrix.spec.rb +3 -3
  77. data/specs/r_vector_operators.spec.rb +13 -7
  78. data/specs/tmp.rb +42 -12
  79. data/version.rb +1 -1
  80. metadata +28 -24
  81. data/bin/gknit2 +0 -14
  82. data/bin/prepareR.rb +0 -3
  83. data/bin/tmp.py +0 -51
  84. data/blogs/gknit/gknit.html +0 -528
  85. data/blogs/gknit/gknit.md +0 -628
  86. data/blogs/gknit/gknit.pdf +0 -0
  87. data/blogs/gknit/gknit.tex +0 -745
  88. data/blogs/gknit/gknit_files/figure-html/bubble-1.png +0 -0
  89. data/blogs/gknit/gknit_files/figure-html/diverging_bar.png +0 -0
  90. data/blogs/ruby_plot/figures/dose_len.png +0 -0
  91. data/blogs/ruby_plot/figures/facet_by_delivery.png +0 -0
  92. data/blogs/ruby_plot/figures/facet_by_dose.png +0 -0
  93. data/blogs/ruby_plot/figures/facets_by_delivery_color.png +0 -0
  94. data/blogs/ruby_plot/figures/facets_by_delivery_color2.png +0 -0
  95. data/blogs/ruby_plot/figures/facets_with_decorations.png +0 -0
  96. data/blogs/ruby_plot/figures/facets_with_jitter.png +0 -0
  97. data/blogs/ruby_plot/figures/facets_with_points.png +0 -0
  98. data/blogs/ruby_plot/figures/final_box_plot.png +0 -0
  99. data/blogs/ruby_plot/figures/final_violin_plot.png +0 -0
  100. data/blogs/ruby_plot/figures/violin_with_jitter.png +0 -0
  101. data/lib/R/eng_ruby.R +0 -62
  102. data/lib/R_interface/rdevices.rb +0 -225
@@ -1,16 +1,18 @@
1
1
  ---
2
- title: "gKnit - Ruby and R Knitting with Galaaz in GraalVM"
3
- author: "Rodrigo Botafogo"
4
- tags: [Galaaz, Ruby, R, TruffleRuby, FastR, GraalVM, knitr]
5
- date: "19 October 2018"
2
+ title: "How to do reproducible research in Ruby with gKnit"
3
+ author:
4
+ - "Rodrigo Botafogo"
5
+ - "Daniel Mossé - University of Pittsburgh"
6
+ tags: [Tech, Data Science, Ruby, R, GraalVM]
7
+ date: "20/02/2019"
6
8
  output:
7
- html_document:
8
- self_contained: true
9
- keep_md: true
10
9
  pdf_document:
11
10
  includes:
12
11
  in_header: ["../../sty/galaaz.sty"]
13
12
  number_sections: yes
13
+ html_document:
14
+ self_contained: true
15
+ keep_md: true
14
16
  ---
15
17
 
16
18
  ```{r setup, echo=FALSE}
@@ -21,7 +23,8 @@ output:
21
23
 
22
24
  The idea of "literate programming" was first introduced by Donald Knuth in the 1980's.
23
25
  The main intention of this approach was to develop software interspersing macro snippets,
24
- traditional source code, and a natural language such as English that could be compiled into
26
+ traditional source code, and a natural language such as English in a document
27
+ that could be compiled into
25
28
  executable code and at the same time easily read by a human developer. According to Knuth
26
29
  "The practitioner of
27
30
  literate programming can be regarded as an essayist, whose main concern is with exposition
@@ -41,11 +44,11 @@ contained the whole narrative to reproduce the research. But Sweave had many pr
41
44
  problems from Sweave and including in one single package many extensions and add-on packages that
42
45
  were necessary for Sweave.
43
46
 
44
- With Knitr, R markdown was also developed, an extension the the
47
+ With Knitr, R markdown was also developed, an extension to the
45
48
  Markdown format. With R markdown and Knitr it is possible to generate reports in a multitude
46
49
  of formats such as HTML, markdown, Latex, PDF, dvi, etc. R markdown also allows the use of
47
50
  multiple programming languages in the same document. In R markdown text is interspersed with
48
- code chunks that can be executed and both the code as the result of executing the code can become
51
+ code chunks that can be executed and both the code and its results can become
49
52
  part of the final report. Although R markdown allows multiple programming languages in the
50
53
  same document, only R and Python (with
51
54
  the reticulate package) can persist variables between chunks. For other languages, such as
@@ -54,49 +57,64 @@ is somehow stored in a data file that is read by the next chunk.
54
57
 
55
58
  Being able to persist data
56
59
  between chunks is critical for literate programming otherwise the flow of the narrative is lost
57
- by all the effort of having to save data and then reload it. Probably, because of this impossibility,
58
- it is very rare to see any R markdown document document in the Ruby community.
60
+ by all the effort of having to save data and then reload it. Probably, because of
61
+ this impossibility,
62
+ it is very rare to see any R markdown document in the Ruby community. Also, the use of
63
+ R markdown in the Ruby community would require the Ruby developer to download R and
64
+ have some minimal knowledge of Knitr.
59
65
 
60
66
  In the Python community, the same effort to have code and text in an integrated environment
61
- started also on the first decade of 2000. In 2006 iPython 0.7.2 was released. In 2014,
67
+ started around the first decade of 2000. In 2006 iPython 0.7.2 was released. In 2014,
62
68
  Fernando Pérez spun off the Jupyter project from iPython, creating a web-based interactive
63
69
  computation environment. Jupyter can now be used with many languages, including Ruby with the
64
70
  iruby gem (https://github.com/SciRuby/iruby). I am not sure if multiple languages can be used
65
- in a Jupyter notebook.
71
+ in a Jupyter notebook and if variables can persist between chunks.
66
72
 
67
73
  # gKnitting a Document
68
74
 
69
75
  This document describes gKnit. gKnit uses Knitr and R markdown to knit a document in Ruby or R
70
- and output it in any of the
71
- available formats for R markdown. The only difference between gKnit and normal Knitr documents
72
- is that gKnit runs atop of GraalVM, and Galaaz (an integration library between Ruby and R).
73
- Another blog post on Galaaz and its integration with ggplot2 can be found at:
74
- https://towardsdatascience.com/ruby-plotting-with-galaaz-an-example-of-tightly-coupling-ruby-and-r-in-graalvm-520b69e21021. With Galaaz, gKnit can knit documents in Ruby and R and both
75
- Ruby and R execute on the same process and memory, variables, classes, etc.
76
- will be preserved between chunks of code.
76
+ and output it in any of the available formats for R markdown.
77
+ gKnit runs atop GraalVM and Galaaz (an integration
78
+ library between Ruby and R). In gKnit, Ruby variables are persisted between chunks, making
79
+ it an ideal solution for literate programming in this language. Also, since it is based on
80
+ Galaaz, Ruby chunks can have access to R variables and Polyglot Programming with Ruby and R
81
+ is quite natural.
82
+
83
+ Galaaz has already been described in the following posts:
77
84
 
78
- This is not a blog post on rmarkdown, and the interested user is directed to
85
+ * https://towardsdatascience.com/ruby-plotting-with-galaaz-an-example-of-tightly-coupling-ruby-and-r-in-graalvm-520b69e21021.
86
+ * https://medium.freecodecamp.org/how-to-make-beautiful-ruby-plots-with-galaaz-320848058857
87
+
88
+ This is not a blog post on R markdown, and the interested user is directed to the following links
89
+ for detailed information on its capabilities and use.
79
90
 
80
91
  * https://rmarkdown.rstudio.com/ or
81
- * https://bookdown.org/yihui/rmarkdown/ for detailed information on its capabilities and use.
92
+ * https://bookdown.org/yihui/rmarkdown/
82
93
 
83
94
  Here, we will describe quickly the main aspects of R markdown, so the user can start gKnitting
84
95
  Ruby and R documents quickly.
85
96
 
86
97
  ## The Yaml header
87
98
 
88
- An R markdown document should start with a Yaml header and be stored in a file with '.Rmd' extension.
89
- This document has the following header for gKitting an HTML document.
99
+ An R markdown document should start with a Yaml header and be stored in a file with
100
+ '.Rmd' extension. This document has the following header for gKnitting an HTML document.
90
101
 
91
102
  ```
92
103
  ---
93
- title: "gKnit - Ruby and R Knitting with Galaaz in GraalVM"
94
- author: "Rodrigo Botafogo"
95
- tags: [Galaaz, Ruby, R, TruffleRuby, FastR, GraalVM, knitr, gknit]
96
- date: "29 October 2018"
104
+ title: "How to do reproducible research in Ruby with gKnit"
105
+ author:
106
+ - "Rodrigo Botafogo"
107
+ - "Daniel Mossé - University of Pittsburgh"
108
+ tags: [Tech, Data Science, Ruby, R, GraalVM]
109
+ date: "20/02/2019"
97
110
  output:
98
111
  html_document:
112
+ self_contained: true
99
113
  keep_md: true
114
+ pdf_document:
115
+ includes:
116
+ in_header: ["../../sty/galaaz.sty"]
117
+ number_sections: yes
100
118
  ---
101
119
  ```
102
120
 
@@ -140,62 +158,80 @@ Ordered Lists
140
158
 
141
159
  Please, go to https://rmarkdown.rstudio.com/authoring_basics.html, for more R markdown formatting.
142
160
 
143
- ## Code Chunks
161
+ ### R chunks
144
162
 
145
- Running and executing Ruby and R code is actually what really interests us is this blog. Inserting
146
- a code chunk is done by adding code in a block delimited by three back ticks followed by a
147
- block with the engine name (r, ruby, rb, include, others), an optional chunk_label and optional
148
- options, as shown bellow:
163
+ Running and executing Ruby and R code is what really interests us in this blog.
164
+ Inserting a code chunk is done by adding code in a block delimited by three back ticks
165
+ followed by an open
166
+ curly brace ('{') followed by the engine name (r, ruby, rb, include, ...), an
167
+ optional chunk_label and options, as shown below:
149
168
 
150
169
  ````
151
170
  ```{engine_name [chunk_label], [chunk_options]}`r ''`
152
171
  ```
153
172
  ````
154
173
 
155
- for instance, let's add an R chunk to the document labeled 'first_r_chunk'. In this case, the
156
- code should not be shown in the document, so the option 'echo=FALSE' was added.
174
+ for instance, let's add an R chunk to the document labeled 'first_r_chunk'. This is
175
+ a very simple piece of code that just creates a variable and prints it out. The code block should
176
+ be defined as follows:
157
177
 
158
178
  ````
159
- ```{r first_r_chunk, echo = FALSE}`r ''`
179
+ ```{r first_r_chunk}`r ''`
180
+ vec <- c(1, 2, 3)
181
+ print(vec)
160
182
  ```
161
183
  ````
162
184
 
163
- A description of the available chunk options can be found in the documentation cited above.
185
+ If this block is added to an R markdown document and gKnitted the result will be:
186
+
187
+ ```{r first_r_chunk}
188
+ vec <- c(1, 2, 3)
189
+ print(vec)
190
+ ```
164
191
 
165
- For including a Ruby chunk, just change the name of the engine to ruby as follows:
192
+ Now let's say that we want to do some analysis in the code, but just print the result and not the
193
+ code itself. For this, we need to add the option 'echo = FALSE'.
166
194
 
167
195
  ````
168
- ```{ruby first_ruby_chunk}`r ''`
196
+ ```{r second_r_chunk, echo = FALSE}`r ''`
197
+ vec2 <- c(10, 20, 30)
198
+ vec3 <- vec * vec2
199
+ print(vec3)
169
200
  ```
170
201
  ````
202
+ Here is how this block will show up in the document. Observe that the code is not shown
203
+ and we only see the execution result in a white box.
171
204
 
172
- In this example, the ruby chunk is called 'first_ruby_chunk'. One important aspect of chunk
173
- labels is that they cannot be duplicate. If a chunk label is duplicate, the knitting will
174
- stop with an error.
205
+ ```{r second_r_chunk, echo = FALSE}
206
+ vec2 <- c(10, 20, 30)
207
+ vec3 <- vec * vec2
208
+ print(vec3)
209
+ ```
175
210
 
176
- ### R chunks
211
+ A description of the available chunk options can be found in the documentation cited above.
177
212
 
178
- Let's now add an R chunk to this document. In this example, a vector 'r_vec' is created and
179
- a new function 'redef_sum' is defined. The chunk specification is
213
+ Let's add another R chunk with a function definition. In this example, a vector
214
+ 'r_vec' is created and
215
+ a new function 'reduce_sum' is defined. The chunk specification is
180
216
 
181
217
  ````
182
218
  ```{r data_creation}`r ''`
183
219
  r_vec <- c(1, 2, 3, 4, 5)
184
220
 
185
- redef_sum <- function(...) {
221
+ reduce_sum <- function(...) {
186
222
  Reduce(sum, as.list(...))
187
223
  }
188
224
  ```
189
225
  ````
190
226
 
191
- and this is how it will look like once executed. From now on, we will not show the chunk
192
- definition any longer.
227
+ and this is how it will look once executed. From now on, we will not
228
+ show the chunk definition any longer.
193
229
 
194
230
 
195
231
  ```{r data_creation}
196
232
  r_vec <- c(1, 2, 3, 4, 5)
197
233
 
198
- redef_sum <- function(...) {
234
+ reduce_sum <- function(...) {
199
235
  Reduce(sum, as.list(...))
200
236
  }
201
237
  ```
@@ -204,14 +240,19 @@ We can, possibly in another chunk, access the vector and call the function as fo
204
240
 
205
241
  ```{r using_previous}
206
242
  print(r_vec)
207
- print(redef_sum(r_vec))
243
+ print(reduce_sum(r_vec))
208
244
  ```
245
+ ### R Graphics with ggplot
209
246
 
210
- ```{r bubble}
247
+ In the following chunk, we create a bubble chart in R using ggplot and include it in
248
+ this document. Note that there is no directive in the code to include the image; this
249
+ occurs automatically. The 'mpg' dataframe is natively available to R and to Galaaz as
250
+ well.
251
+
252
+ ```{r bubble, dev='png'}
211
253
  # load package and data
212
254
  library(ggplot2)
213
255
  data(mpg, package="ggplot2")
214
- # mpg <- read.csv("http://goo.gl/uEeRGu")
215
256
 
216
257
  mpg_select <- mpg[mpg$manufacturer %in% c("audi", "ford", "honda", "hyundai"), ]
217
258
 
@@ -221,42 +262,64 @@ g <- ggplot(mpg_select, aes(displ, cty)) +
221
262
  labs(subtitle="mpg: Displacement vs City Mileage",
222
263
  title="Bubble chart")
223
264
 
224
- g + geom_jitter(aes(col=manufacturer, size=hwy)) +
225
- geom_smooth(aes(col=manufacturer), method="lm", se=F)
265
+ g <- g + geom_jitter(aes(col=manufacturer, size=hwy)) +
266
+ geom_smooth(aes(col=manufacturer), method="lm", se=F)
267
+
226
268
  ```
227
269
 
228
270
  ### Ruby chunks
229
271
 
230
- In the same way that an R chunk was created, let's now create a Ruby chunk. One important aspect
231
- of Ruby is that in Ruby every evaluation of a chunk occurs on its own local scope, so, creating
232
- a variable in a chunk will be out of scope in the next chunk. To make sure that variables are
233
- available between chunks, they should be made global.
234
272
 
235
- In this chunk, variable '\$a', '\$b' and '\$c' are standard Ruby variables and '\$vec' and '\$vec2'
236
- are two vectors created by a call to FastR. It should be clear that there is no requirement
237
- in gknit to call or use R functions. gKnit will knit standard Ruby code, or even general
238
- text without code.
273
+ Including a Ruby chunk is just as easy as including an R chunk in the document: just
274
+ change the name of the engine to 'ruby'. It is also possible to pass chunk options
275
+ to the Ruby engine; however, this version does not accept all the options that are
276
+ available to R chunks. Future versions will add those options.
277
+
278
+ ````
279
+ ```{ruby first_ruby_chunk}`r ''`
280
+ ```
281
+ ````
282
+
283
+ In this example, the ruby chunk is called 'first_ruby_chunk'. One important
284
+ aspect of chunk labels is that they cannot be duplicated. If a chunk label is
285
+ duplicated, gKnitting will stop with an error.
286
+
287
+ Another relevant point with Ruby chunks is that they are evaluated in the scope
288
+ of a class called RubyChunk. To make sure that variables are
289
+ available between chunks, they should be made instance variables of the
290
+ RubyChunk class. In the following chunk, variables '\@a', '\@b' and '\@c'
291
+ are standard Ruby variables and '\@vec' and '\@vec2' are two vectors created
292
+ by calling the 'c' method on the R module.
293
+
294
+ In Galaaz, the R module allows us to access R functions transparently. It
295
+ should be clear that there is no requirement in gknit to call or use any R
296
+ functions. gKnit will knit standard Ruby code, or even general text without
297
+ any code.
239
298
 
240
299
  ```{ruby split_data}
241
- $a = [1, 2, 3]
242
- $b = "US$ 250.000"
243
- $c = "Inline text in a Heading"
300
+ @a = [1, 2, 3]
301
+ @b = "US$ 250.000"
302
+ @c = "The 'outputs' function"
244
303
 
245
- $vec = R.c(1, 2, 3)
246
- $vec2 = R.c(10, 20, 30)
304
+ @vec = R.c(1, 2, 3)
305
+ @vec2 = R.c(10, 20, 30)
247
306
  ```
248
307
 
249
- In this next block, variables '\$a', '\$vec' and '\$vec2' are used and printed.
308
+ In this next block, variables '\@a', '\@vec' and '\@vec2' are used and printed.
250
309
 
251
310
  ```{ruby split2}
252
- puts $a
253
- puts $vec * $vec2
311
+ puts @a
312
+ puts @vec * @vec2
254
313
  ```
255
314
 
315
+ Note that @a is a standard Ruby Array and @vec and @vec2 are vectors that behave accordingly,
316
+ where multiplication works as expected.
317
+
318
+
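
The scoping rule described above can be illustrated with plain Ruby. This is a minimal sketch, assuming each chunk is evaluated as a string against one shared object; the class name `ChunkScope` and its `run` method are hypothetical and are not taken from gKnit's source. It shows why instance variables survive from one chunk to the next while local variables do not.

```ruby
# Hypothetical stand-in for the class in whose scope chunks are evaluated;
# gKnit's real RubyChunk class may work differently.
class ChunkScope
  # Evaluate one chunk's source code in the scope of this object.
  def run(source)
    instance_eval(source)
  end
end

scope = ChunkScope.new

# First "chunk": defines one instance variable and one local variable.
scope.run('@a = [1, 2, 3]; b = "local only"')

# Second "chunk": the instance variable is still available ...
persisted = scope.run('@a.sum')

# ... but the local variable from the first chunk is gone.
local_visible = scope.run('defined?(b) ? true : false')

puts persisted      # prints 6
puts local_visible  # prints false
```

Because every call to `run` starts a fresh local scope, only state stored on the shared object (the instance variables) carries over between chunks.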
256
319
  ### Accessing R from Ruby
257
320
 
258
321
  One of the nice aspects of Galaaz on GraalVM is that variables and functions defined in R can
259
- be easily accessed from Ruby. This next chunk, reads data from R and uses the 'redef_fun'
322
+ be easily accessed from Ruby. This next chunk reads data from R and uses the 'reduce_sum'
260
323
  function defined previously. To access an R variable from Ruby the '~' function should be
261
324
  applied to the Ruby symbol representing the R variable. Since the R variable is called 'r_vec',
262
325
  in Ruby, the symbol to access it is ':r_vec' and thus '~:r_vec' retrieves the value of the
@@ -269,124 +332,176 @@ puts ~:r_vec
269
332
  In order to call an R function, the 'R.' module is used as follows
270
333
 
271
334
  ```{ruby call_r_func}
272
- puts R.redef_sum($vec)
335
+ puts R.reduce_sum(~:r_vec)
273
336
  ```
274
337
 
275
- ### Inline Ruby code
338
+ ### Ruby Plotting
339
+
340
+ We have seen an example of plotting with R. Plotting with Ruby does not require
341
+ anything different from plotting with R:
342
+
343
+ ```{ruby diverging_bar, fig.width = 9.1, fig.height = 6.5}
344
+ require 'ggplot'
345
+
346
+ mtcars = ~:mtcars
276
347
 
277
- Knitr allows inserting R inline by adding
278
- ```{rb puts "&#96;r code&#96;"}
348
+ mtcars.car_name = mtcars.rownames # create new column for car names
349
+ mtcars.mpg_z = ((mtcars.mpg - mtcars.mpg.mean) / mtcars.mpg.sd).round 2
350
+ mtcars.mpg_type = (mtcars.mpg_z < 0).ifelse('below', 'above')
351
+ mtcars = mtcars[mtcars.mpg_z.order, :all]
352
+ mtcars.car_name = R.factor(mtcars.car_name, levels: mtcars.car_name)
353
+
354
+ puts mtcars.ggplot(E.aes(x: :car_name, y: :mpg_z, label: :mpg_z)) +
355
+ R.geom_bar(E.aes(fill: :mpg_type), stat: 'identity', width: 0.5) +
356
+ R.scale_fill_manual(name: 'Mileage',
357
+ labels: R.c('Above Average', 'Below Average'),
358
+ values: R.c('above': '#00ba38', 'below': '#f8766d')) +
359
+ R.labs(subtitle: "Normalised mileage from 'mtcars'",
360
+ title: "Diverging Bars") +
361
+ R.coord_flip
279
362
  ```
280
- . Unfortunately, this is not possible with Ruby code as there is no provision in knitr for
281
- adding this kind of inline engine. However, gKnit allows adding inline Ruby code with the
282
- 'rb' engine. The following text will create and inline Ruby text:
363
+
364
+ ### Inline Ruby code
365
+
366
+ When using a Ruby chunk, the code and the output are formatted in blocks as seen above.
367
+ This formatting is not always desired. Sometimes, one wants to have the result of the
368
+ Ruby evaluation included in the middle of a phrase. gKnit allows adding inline Ruby code
369
+ with the 'rb' engine. The following text will
370
+ create an inline Ruby text:
283
371
 
284
372
  ````
285
- This is some text with inline Ruby accessing variable \$b which has value:
286
- ```{rb puts "```{rb puts $b}\n```"}
373
+ This is some text with inline Ruby accessing variable \@b which has value:
374
+ ```{rb puts "```{rb puts @b}\n```"}
287
375
  ```
288
376
  and is followed by some other text!
289
377
  ````
290
378
 
291
- The result of executing the above chunk is the following sentence with inline Ruby code
379
+ Note that it is important not to add any new line before or after the code
380
+ block if we want everything to be in only one line, resulting in the following sentence
381
+ with inline Ruby code
292
382
 
293
- <div style="margin-bottom:50px;">
383
+ <div style="margin-bottom:30px;">
294
384
  </div>
295
385
 
296
- This is some text with inline Ruby accessing variable \$b which has value:
297
- ```{rb puts $b}
386
+ This is some text with inline Ruby accessing variable \@b which has value:
387
+ ```{rb puts @b}
298
388
  ```
299
389
  and is followed by some other text!
300
390
 
301
- <div style="margin-bottom:50px;">
302
- </div>
303
-
304
- In an inline block, it is possible to execute multiple Ruby statements by adding a semicolon
305
- between them:
306
-
307
- ````
308
- Multiple statements in the 'rb' engine use semicolon:
309
- ```{rb puts "```{rb puts $a, puts $b}\n```"}
310
- ```
311
- ````
312
-
313
- <div style="margin-bottom:50px;">
391
+ <div style="margin-bottom:30px;">
314
392
  </div>
315
393
 
316
394
 
317
- Multiple statements in the 'rb' engine use semicolon:
318
- ```{rb puts $a; puts $b}
395
+ ```{ruby heading, echo = FALSE}
396
+ outputs "### #{@c}"
319
397
  ```
320
398
 
321
- <div style="margin-bottom:50px;">
322
- </div>
399
+ We have previously used the standard 'puts' method in Ruby chunks in order to get some
400
+ output. As can be seen, the result of a 'puts' is formatted inside a white box that
401
+ follows the code block. Many times however, we would like to do some processing in the
402
+ Ruby chunk and have the result of this processing generate an output that is
403
+ 'included' in the document as if we had typed it in R markdown.
323
404
 
405
+ For example, suppose we want to create a new 'heading' in our document, but the heading
406
+ phrase is the result of some code processing: maybe it's the first line of a file we are
407
+ going to read. Method 'outputs' adds its output as if typed in the R markdown document.
324
408
 
325
- ```{rb puts "### #{$c}"}
326
- ```
409
+ Now take a look at variable '@c' (it was defined in a previous block above) as
410
+ '@c = "The 'outputs' function"'. "The 'outputs' function" is actually the name of this
411
+ section and it was created using the 'outputs' function inside a Ruby chunk.
327
412
 
328
- Sometimes one wants to add an inline text in a heading. To do that in Ruby the whole heading
329
- needs to be returned by the inline Ruby engine. For example the heading above, was created by
330
- the following chunk:
413
+ The ruby chunk to generate this heading is:
331
414
 
332
415
  ````
333
- ```{rb puts %q(```{rb puts "### #{$c}"}\n```)}
416
+ ```{ruby heading}`r ''`
417
+ outputs "### #{@c}"
334
418
  ```
335
419
  ````
336
420
 
337
- Remember that variable '$\c' was defined in a previous Ruby chunk and is now being used to
338
- create the section heading for this section.
421
+ The three '###' are the way we add a Heading 3 in R markdown.
339
422
 
340
423
 
341
- ### Plotting
424
+ ### HTML Output from Ruby Chunks
342
425
 
343
- ```{ruby diverging_bar}
344
- require 'ggplot'
426
+ We've just seen the use of method 'outputs' to add text to the R markdown
427
+ document. This technique can also be used to add HTML code to the document. In R
428
+ markdown any html code typed directly in the document will be properly rendered.
429
+ Here, for instance, is a table definition in HTML and its output in the document:
430
+
431
+ ```
432
+ <table style="width:100%">
433
+ <tr>
434
+ <th>Firstname</th>
435
+ <th>Lastname</th>
436
+ <th>Age</th>
437
+ </tr>
438
+ <tr>
439
+ <td>Jill</td>
440
+ <td>Smith</td>
441
+ <td>50</td>
442
+ </tr>
443
+ <tr>
444
+ <td>Eve</td>
445
+ <td>Jackson</td>
446
+ <td>94</td>
447
+ </tr>
448
+ </table>
449
+ ```
450
+ <div style="margin-bottom:30px;">
451
+ </div>
345
452
 
346
- R.theme_set R.theme_bw
453
+ <table style="width:100%">
454
+ <tr>
455
+ <th>Firstname</th>
456
+ <th>Lastname</th>
457
+ <th>Age</th>
458
+ </tr>
459
+ <tr>
460
+ <td>Jill</td>
461
+ <td>Smith</td>
462
+ <td>50</td>
463
+ </tr>
464
+ <tr>
465
+ <td>Eve</td>
466
+ <td>Jackson</td>
467
+ <td>94</td>
468
+ </tr>
469
+ </table>
470
+
471
+ <div style="margin-bottom:30px;">
472
+ </div>
347
473
 
348
- # Data Prep
349
- mtcars = ~:mtcars
350
- mtcars.car_name = R.rownames(:mtcars)
351
- # compute normalized mpg
352
- mtcars.mpg_z = ((mtcars.mpg - mtcars.mpg.mean)/mtcars.mpg.sd).round 2
353
- mtcars.mpg_type = mtcars.mpg_z < 0 ? "below" : "above"
354
- mtcars = mtcars[mtcars.mpg_z.order, :all]
355
- # convert to factor to retain sorted order in plot
356
- mtcars.car_name = mtcars.car_name.factor levels: mtcars.car_name
357
-
358
- # Diverging Barcharts
359
- # R.png
360
- gg = mtcars.ggplot(E.aes(x: :car_name, y: :mpg_z, label: :mpg_z)) +
361
- R.geom_bar(E.aes(fill: :mpg_type), stat: 'identity', width: 0.5) +
362
- R.scale_fill_manual(name: "Mileage",
363
- labels: R.c("Above Average", "Below Average"),
364
- values: R.c("above": "#00ba38", "below": "#f8766d")) +
365
- R.labs(subtitle: "Normalised mileage from 'mtcars'",
366
- title: "Diverging Bars") +
367
- R.coord_flip()
368
- print gg
369
- # R.dev__off
370
- # R.include_graphics("Rplot001.png")
474
+ But manually creating HTML output is not always easy or desirable. The above
475
+ table certainly looks ugly. The 'kableExtra' library is a great library for
476
+ creating beautiful tables. Take a look at https://cran.r-project.org/web/packages/kableExtra/vignettes/awesome_table_in_html.html
477
+
478
+ In the next chunk, we output the 'mtcars' dataframe from R in a nicely formatted
479
+ table. Note that we retrieve the mtcars dataframe by using '~:mtcars'.
480
+
481
+ ```{ruby nice_table}
482
+ R.library('kableExtra')
483
+ outputs (~:mtcars).kable.kable_styling
371
484
  ```
372
485
 
373
486
  ### Including Ruby files
374
487
 
375
- R is a language that was created to be easy and fast for statisticians to use. It was not a
488
+ R is a language that was created to be easy and fast for statisticians to use. As far
489
+ as I know (and please correct me if you think otherwise), it was not a
376
490
  language to be used for developing large systems. Of course, there are large systems and
377
491
  libraries in R, but the focus of the language is on developing statistical models and
378
492
  distributing them to peers.
379
493
 
380
494
  Ruby on the other hand, is a language for large software development. Systems written in
381
- Ruby will have dozens or hundreds of files. In order to document a large system with
495
+ Ruby will have dozens, hundreds or even thousands of files. In order to document a
496
+ large system with
382
497
  literate programming we cannot expect the developer to add all the files in a single '.Rmd'
383
498
  file. gKnit provides the 'include' chunk engine to include a Ruby file as if it had been
384
499
  typed in the '.Rmd' file.
385
500
 
386
- To include a file the following chunk should be created, where <filename> is the name of
501
+ To include a file, the following chunk should be created, where <filename> is the name of
387
502
  the file to be included and where the extension, if it is '.rb', does not need to be added.
388
503
  If the 'relative' option is not included, then it is treated as TRUE. When 'relative' is
389
- true, 'require_relative' semantics is used to load the file, when false, Ruby's $LOAD_PATH
504
+ true, 'require_relative' semantics is used to load the file, when false, Ruby's \$LOAD_PATH
390
505
  is searched to find the file and it is 'require'd.
391
506
 
392
507
  ````
@@ -394,8 +509,17 @@ is searched to find the file and it is 'require'd.
394
509
  ```
395
510
  ````
396
511
 
397
- Here we include file 'model.rb' which is in the same directory of this blog. This code
398
- uses R 'caret' package to split a dataset in a train and test sets.
512
+ Here we include file 'model.rb' which is in the same directory of this blog.
513
+ This code uses R's 'caret' package to split a dataset into train and test sets.
514
+ The 'caret' package is a very important and useful package for doing Data Analysis;
515
+ it has hundreds of functions for all steps of the Data Analysis workflow. Using
516
+ it just to split a dataset is using the proverbial cannon to kill a fly. We use
517
+ it here only to show that integrating Ruby and R and using even a very complex
518
+ package such as 'caret' is trivial with Galaaz.
519
+
520
+ A word of advice: the 'caret' package has lots of dependencies and installing
521
+ it in a Linux system is a time consuming operation. Method 'R.install_and_loads'
522
+ will install the package if it is not already installed and can take a while.
399
523
 
400
524
  ````
401
525
  ```{include model}`r ''`
@@ -419,15 +543,14 @@ gKnit also allows developers to document and load files that are not in the same
419
543
  of the '.Rmd' file. When using 'relative = FALSE' in a chunk header, gKnit will look for the
420
544
  file in Ruby's \$LOAD_PATH and load it if found.
421
545
 
422
- Here is an example of loading the 'continuation.rb' file from TruffleRuby.
546
+ Here is an example of loading the 'find.rb' file from TruffleRuby.
423
547
 
424
548
  ````
425
- ```{include continuation, relative = FALSE}`r ''`
549
+ ```{include find, relative = FALSE}`r ''`
426
550
  ```
427
551
  ````
428
552
 
429
-
430
- ```{include continuation, relative = FALSE}
553
+ ```{include find, relative = FALSE}
431
554
  ```
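
The two lookup styles controlled by the 'relative' option can be sketched in plain Ruby. This is a minimal illustration, assuming the behaviour described in the text; the helper `resolve_included`, its `base_dir:` parameter and the file name `gknit_sketch_model` are hypothetical and are not part of gKnit. With `relative: true` the file is looked up next to the document, as `require_relative` would do; with `relative: false` each entry of Ruby's `$LOAD_PATH` is searched, as `require` would do.

```ruby
require 'tmpdir'

# Hypothetical helper (an assumption, not gKnit's code): returns the path a
# file name resolves to, or nil when nothing is found.
def resolve_included(name, base_dir:, relative: true)
  file = name.end_with?('.rb') ? name : "#{name}.rb" # '.rb' is optional
  candidates =
    if relative
      # require_relative style: look next to the including document
      [File.join(base_dir, file)]
    else
      # require style: try every $LOAD_PATH entry in order
      $LOAD_PATH.map { |dir| File.join(dir.to_s, file) }
    end
  candidates.find { |path| File.file?(path) }
end

relative_hit, load_path_miss, load_path_hit = Dir.mktmpdir do |dir|
  File.write(File.join(dir, 'gknit_sketch_model.rb'), "# empty\n")

  hit  = resolve_included('gknit_sketch_model', base_dir: dir)
  miss = resolve_included('gknit_sketch_model', base_dir: dir, relative: false)

  # Once the directory is on $LOAD_PATH, the non-relative lookup succeeds too.
  $LOAD_PATH.unshift(dir)
  found = resolve_included('gknit_sketch_model', base_dir: dir, relative: false)
  $LOAD_PATH.delete(dir)

  [hit, miss, found]
end
```

In this sketch the relative lookup finds the file immediately, while the `$LOAD_PATH` lookup only finds it after its directory has been added to the load path.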
432
555
 
433
556
  ## Converting to PDF
@@ -497,4 +620,4 @@ the gnu compiler and tools should be enough. I am not sure what is needed on th
497
620
 
498
621
  ## Usage
499
622
 
500
- * gknit <filename>
623
+ * gknit \<filename\>