galaaz 0.4.9 → 0.4.10

Files changed (76)
  1. checksums.yaml +4 -4
  2. data/README.md +798 -285
  3. data/blogs/galaaz_ggplot/galaaz_ggplot.Rmd +3 -12
  4. data/blogs/galaaz_ggplot/galaaz_ggplot.aux +5 -7
  5. data/blogs/galaaz_ggplot/galaaz_ggplot.html +69 -29
  6. data/blogs/galaaz_ggplot/galaaz_ggplot.pdf +0 -0
  7. data/blogs/galaaz_ggplot/galaaz_ggplot_files/figure-html/midwest_rb.png +0 -0
  8. data/blogs/galaaz_ggplot/galaaz_ggplot_files/figure-html/scatter_plot_rb.png +0 -0
  9. data/blogs/galaaz_ggplot/galaaz_ggplot_files/figure-latex/midwest_rb.pdf +0 -0
  10. data/blogs/galaaz_ggplot/galaaz_ggplot_files/figure-latex/scatter_plot_rb.pdf +0 -0
  11. data/blogs/galaaz_ggplot/midwest.Rmd +1 -9
  12. data/blogs/gknit/gknit.Rmd +37 -40
  13. data/blogs/gknit/gknit.html +32 -30
  14. data/blogs/gknit/gknit.md +36 -37
  15. data/blogs/gknit/gknit.pdf +0 -0
  16. data/blogs/gknit/gknit.tex +35 -37
  17. data/blogs/manual/manual.Rmd +548 -125
  18. data/blogs/manual/manual.html +509 -286
  19. data/blogs/manual/manual.md +798 -285
  20. data/blogs/manual/manual.pdf +0 -0
  21. data/blogs/manual/manual.tex +2816 -0
  22. data/blogs/manual/manual_files/figure-latex/diverging_bar.pdf +0 -0
  23. data/blogs/nse_dplyr/nse_dplyr.Rmd +240 -74
  24. data/blogs/nse_dplyr/nse_dplyr.html +191 -87
  25. data/blogs/nse_dplyr/nse_dplyr.md +361 -107
  26. data/blogs/nse_dplyr/nse_dplyr.pdf +0 -0
  27. data/blogs/nse_dplyr/nse_dplyr.tex +1373 -0
  28. data/blogs/ruby_plot/ruby_plot.Rmd +61 -81
  29. data/blogs/ruby_plot/ruby_plot.html +54 -57
  30. data/blogs/ruby_plot/ruby_plot.md +48 -67
  31. data/blogs/ruby_plot/ruby_plot.pdf +0 -0
  32. data/blogs/ruby_plot/ruby_plot_files/figure-html/dose_len.png +0 -0
  33. data/blogs/ruby_plot/ruby_plot_files/figure-html/facet_by_delivery.png +0 -0
  34. data/blogs/ruby_plot/ruby_plot_files/figure-html/facet_by_dose.png +0 -0
  35. data/blogs/ruby_plot/ruby_plot_files/figure-html/facets_by_delivery_color.png +0 -0
  36. data/blogs/ruby_plot/ruby_plot_files/figure-html/facets_by_delivery_color2.png +0 -0
  37. data/blogs/ruby_plot/ruby_plot_files/figure-html/facets_with_jitter.png +0 -0
  38. data/blogs/ruby_plot/ruby_plot_files/figure-html/facets_with_points.png +0 -0
  39. data/blogs/ruby_plot/ruby_plot_files/figure-html/final_box_plot.png +0 -0
  40. data/blogs/ruby_plot/ruby_plot_files/figure-html/final_violin_plot.png +0 -0
  41. data/blogs/ruby_plot/ruby_plot_files/figure-html/violin_with_jitter.png +0 -0
  42. data/blogs/ruby_plot/ruby_plot_files/figure-latex/dose_len.png +0 -0
  43. data/blogs/ruby_plot/ruby_plot_files/figure-latex/facet_by_delivery.png +0 -0
  44. data/blogs/ruby_plot/ruby_plot_files/figure-latex/facet_by_dose.png +0 -0
  45. data/blogs/ruby_plot/ruby_plot_files/figure-latex/facets_by_delivery_color.png +0 -0
  46. data/blogs/ruby_plot/ruby_plot_files/figure-latex/facets_by_delivery_color2.png +0 -0
  47. data/blogs/ruby_plot/ruby_plot_files/figure-latex/facets_with_decorations.png +0 -0
  48. data/blogs/ruby_plot/ruby_plot_files/figure-latex/facets_with_jitter.png +0 -0
  49. data/blogs/ruby_plot/ruby_plot_files/figure-latex/facets_with_points.png +0 -0
  50. data/blogs/ruby_plot/ruby_plot_files/figure-latex/final_box_plot.png +0 -0
  51. data/blogs/ruby_plot/ruby_plot_files/figure-latex/final_violin_plot.png +0 -0
  52. data/blogs/ruby_plot/ruby_plot_files/figure-latex/violin_with_jitter.png +0 -0
  53. data/lib/R_interface/rdata_frame.rb +0 -12
  54. data/lib/R_interface/robject.rb +14 -14
  55. data/lib/R_interface/ruby_extensions.rb +3 -31
  56. data/lib/R_interface/rvector.rb +0 -12
  57. data/lib/gknit/knitr_engine.rb +5 -3
  58. data/lib/util/exec_ruby.rb +22 -61
  59. data/specs/tmp.rb +26 -12
  60. data/version.rb +1 -1
  61. metadata +22 -17
  62. data/bin/gknit_old_r +0 -236
  63. data/blogs/dev/dev.Rmd +0 -23
  64. data/blogs/dev/dev.md +0 -58
  65. data/blogs/dev/dev2.Rmd +0 -65
  66. data/blogs/dev/model.rb +0 -41
  67. data/blogs/dplyr/dplyr.Rmd +0 -29
  68. data/blogs/dplyr/dplyr.html +0 -433
  69. data/blogs/dplyr/dplyr.md +0 -58
  70. data/blogs/dplyr/dplyr.rb +0 -63
  71. data/blogs/galaaz_ggplot/galaaz_ggplot.log +0 -640
  72. data/blogs/galaaz_ggplot/galaaz_ggplot.md +0 -431
  73. data/blogs/galaaz_ggplot/galaaz_ggplot.tex +0 -481
  74. data/blogs/galaaz_ggplot/midwest.png +0 -0
  75. data/blogs/galaaz_ggplot/scatter_plot.png +0 -0
  76. data/blogs/ruby_plot/ruby_plot.tex +0 -1077
@@ -1,431 +0,0 @@
1
- ---
2
- title: "Ruby Plotting with Galaaz"
3
- subtitle: "An example of tightly coupling Ruby and R in GraalVM"
4
- author: "Rodrigo Botafogo"
5
- tags: [Galaaz, Ruby, R, TruffleRuby, FastR, GraalVM, ggplot2]
6
- date: "16 October 2018"
7
- output:
8
- html_document:
9
- self_contained: true
10
- keep_md: true
11
- pdf_document:
12
- includes:
13
- in_header: "../../sty/galaaz.sty"
14
- keep_tex: yes
15
- number_sections: yes
16
- toc: true
17
- toc_depth: 2
18
- md_document:
19
- variant: markdown_github
20
- fontsize: 11pt
21
- ---
22
-
23
-
24
-
25
- # Introduction
26
-
27
- Galaaz is a system for tightly coupling Ruby and R. Ruby is a powerful language, with
28
- a large community, a very large set of libraries and great for web development. However,
29
- it lacks libraries for data science, statistics, scientific plotting and machine learning.
30
- On the other hand, R is considered one of the most powerful languages for solving all of the
31
- above problems. Maybe the strongest competitor to R is Python with libraries such as NumPy,
32
- pandas, SciPy, scikit-learn and a couple more.
33
-
34
- With Galaaz we do not intend to re-implement any of the scientific libraries in R; instead, we allow
35
- for very tight coupling between the two languages to the point that the Ruby developer does
36
- not need to know that there is an R engine running. For this to happen we use new
37
- technologies provided by Oracle: GraalVM, TruffleRuby and FastR:
38
-
39
- GraalVM is a universal virtual machine for running applications
40
- written in JavaScript, Python 3, Ruby, R, JVM-based languages like Java,
41
- Scala, Kotlin, and LLVM-based languages such as C and C++.
42
-
43
- GraalVM removes the isolation between programming languages and enables
44
- interoperability in a shared runtime. It can run either standalone or in
45
- the context of OpenJDK, Node.js, Oracle Database, or MySQL.
46
-
47
- GraalVM allows you to write polyglot applications with a seamless way to
48
- pass values from one language to another. With GraalVM there is no copying
49
- or marshaling necessary as it is with other polyglot systems. This lets
50
- you achieve high performance when language boundaries are crossed. Most
51
- of the time there is no additional cost for crossing a language boundary
52
- at all.
53
-
54
- Often developers have to make uncomfortable compromises that require them
55
- to rewrite their software in other languages. For example:
56
-
57
- * “That library is not available in my language. I need to rewrite it.”
58
- * “That language would be the perfect fit for my problem, but we cannot
59
- run it in our environment.”
60
- * “That problem is already solved in my language, but the language is
61
- too slow.”
62
-
63
- With GraalVM we aim to allow developers to freely choose the right language
64
- for the task at hand without making compromises.
65
-
66
- Interested readers should also check out the following sites:
67
-
68
- * [GraalVM Home](https://www.graalvm.org/)
69
- * [TruffleRuby](https://github.com/oracle/truffleruby)
70
- * [FastR](https://github.com/oracle/fastr)
71
- * [Faster R with FastR](https://medium.com/graalvm/faster-r-with-fastr-4b8db0e0dceb)
72
-
73
- ## What does Galaaz mean
74
-
75
- Galaaz is the Portuguese name for "Galahad". From Wikipedia:
76
-
77
- Sir Galahad (sometimes referred to as Galeas or Galath),
78
- in Arthurian legend, is a knight of King Arthur's Round Table and one
79
- of the three achievers of the Holy Grail. He is the illegitimate son
80
- of Sir Lancelot and Elaine of Corbenic, and is renowned for his
81
- gallantry and purity as the most perfect of all knights. Emerging quite
82
- late in the medieval Arthurian tradition, Sir Galahad first appears in the
83
- Lancelot–Grail cycle, and his story is taken up in later works such as
84
- the Post-Vulgate Cycle and Sir Thomas Malory's Le Morte d'Arthur.
85
- His name should not be mistaken with Galehaut, a different knight from
86
- Arthurian legend.
87
-
88
- # Galaaz Demo
89
-
90
- ## Prerequisites
91
-
92
- * GraalVM (>= rc7)
93
- * TruffleRuby
94
- * FastR
95
-
96
- The following R packages will be automatically installed when necessary, but could be installed prior
97
- to the demo if desired:
98
-
99
- * ggplot2
100
- * gridExtra
101
-
102
- Installation of R packages requires a development environment. In Linux, the gnu compiler and
103
- tools should be enough. I am not sure what is needed on the Mac.
104
-
105
- In order to run the 'specs' the following Ruby package is necessary:
106
-
107
- * gem install rspec
108
-
109
- ## Preparation
110
-
111
- * gem install galaaz
112
-
113
- ## Running the demo
114
-
115
- The ggplot code for this demo was extracted from: http://r-statistics.co/Top50-Ggplot2-Visualizations-MasterList-R-Code.html.
116
-
117
- On the console do
118
-
119
- > galaaz master_list:scatter_plot
120
-
121
- ## Running other demos
122
-
123
- Running the following on the console
124
-
125
- > galaaz -T
126
-
127
- will show a list with all available demos. To run any of the demos in the list,
128
- replace the call to
129
- 'rake' with 'galaaz'. For instance, one of the examples in the list is 'rake sthda:bar'.
130
- In order to run
131
- this example just do 'galaaz sthda:bar'. Doing 'galaaz sthda:all' will run all demos in the sthda
132
- category. Some of the examples require 'rspec' to be available. To install 'rspec' just do
133
- 'gem install rspec'.
134
-
135
- # The demo code
136
-
137
-
138
- The following is the Ruby code and plot for the above example. There is a small difference between
139
- the code in the example and the code below. If the example is run, the plot will appear on the
140
- screen; below, we generate an 'svg' image and then include it in this document. In order to
141
- generate an image, the R.svg device is used. To generate the plot on the screen, use the R.awt
142
- device, as noted in the code comments.
143
-
144
-
145
- ```ruby
146
- require 'galaaz'
147
- require 'ggplot'
148
-
149
- # load package and data
150
- R.options(scipen: 999) # turn-off scientific notation like 1e+48
151
- R.theme_set(R.theme_bw) # pre-set the bw theme.
152
-
153
- midwest = ~:midwest
154
- # midwest <- read.csv("http://goo.gl/G1K41K") # bkup data source
155
-
156
- # R.awt # run the awt device if the plot should show on the screen
157
- R.svg # run the svg device if an image should be generated
158
-
159
- # Scatterplot
160
- gg = midwest.ggplot(E.aes(x: :area, y: :poptotal)) +
161
- R.geom_point(E.aes(col: :state, size: :popdensity)) +
162
- R.geom_smooth(method: "loess", se: false) +
163
- R.xlim(R.c(0, 0.1)) +
164
- R.ylim(R.c(0, 500000)) +
165
- R.labs(subtitle: "Area Vs Population",
166
- y: "Population",
167
- x: "Area",
168
- title: "Scatterplot",
169
- caption: "Source: midwest")
170
-
171
- R.png('midwest.png') # this line is not necessary with the awt device
172
- puts gg
173
-
174
- R.dev__off # R.dev__off turns off the device. If using awt, the plot
175
- # window will be closed
176
- ```
177
-
178
- ```
179
- ## This is the fake output
180
- ```
181
-
182
-
183
- ![Midwest Plot](midwest.png){width=70%}
184
-
185
- In R, the code to generate this plot is the following
186
-
187
-
188
- ```r
189
- # install.packages("ggplot2")
190
- # load package and data
191
- options(scipen=999) # turn-off scientific notation like 1e+48
192
- library(ggplot2)
193
- theme_set(theme_bw()) # pre-set the bw theme.
194
- data("midwest", package = "ggplot2")
195
- # midwest <- read.csv("http://goo.gl/G1K41K") # bkup data source
196
-
197
- # Scatterplot
198
- gg <- ggplot(midwest, aes(x=area, y=poptotal)) +
199
- geom_point(aes(col=state, size=popdensity)) +
200
- geom_smooth(method="loess", se=F) +
201
- xlim(c(0, 0.1)) +
202
- ylim(c(0, 500000)) +
203
- labs(subtitle="Area Vs Population",
204
- y="Population",
205
- x="Area",
206
- title="Scatterplot",
207
- caption = "Source: midwest")
208
-
209
- plot(gg)
210
- ```
211
-
212
- Note that both codes are very similar. The Ruby code requires the use of "R." before calling
213
- any functions,
214
- for instance R function 'geom_point' becomes 'R.geom_point' in Ruby. R named parameters such as
215
- (col = state, size = popdensity), become in Ruby (col: :state, size: :popdensity).
216
-
217
- One last
218
- point that needs to be observed is the call to the 'aes' function. In Ruby instead of doing
219
- 'R.aes', we use 'E.aes'. The explanation of why E.aes is needed is an advanced topic in R and
220
- depends on what is known as Non-standard Evaluation (NSE) in R. In short, function 'aes' is lazily
221
- evaluated in R, i.e., in R when calling geom_point(aes(col=state, size=popdensity)), function
222
- geom_point receives as argument something similar to a string containing
223
- 'aes(col=state, size=popdensity)', and the aes function will be evaluated inside the geom_point
224
- function. In Ruby, there is no lazy evaluation and doing R.aes would try to evaluate aes
225
- immediately. In order to delay the evaluation of function aes we need to use E.aes. The
226
- interested reader on NSE in R is directed to http://adv-r.had.co.nz/Computing-on-the-language.html.
227
-
228
- # An extension to the example
229
-
230
- If both codes are so similar, then why would one use Ruby instead of R and what good is galaaz
231
- after all?
232
-
233
- Ruby is a modern OO language with numerous very useful constructs such as classes, modules, blocks,
234
- procs, etc. The example above focuses on the coupling of both languages, and does not show the
235
- use of other Ruby constructs. The following, more complex example uses
236
- other Ruby constructs. This is certainly not very well-written or robust Ruby code, but
237
- it gives an idea of how Ruby and R are strongly coupled.
238
-
239
- Let's imagine that we work in a corporation that has its plot themes. So, it has defined a
240
- 'CorpTheme' module. Plots in this corporation should not have grids, numbers in labels should
241
- not use scientific notation and the preferred color is blue.
242
-
243
-
244
- ```ruby
245
- # corp_theme.rb
246
- # defines the corporate theme for all plots
247
-
248
- module CorpTheme
249
-
250
- #---------------------------------------------------------------------------------
251
- # Defines the plot theme (visualization). In this theme we remove major and minor
252
- # grids, borders and background. We also turn-off scientific notation.
253
- #---------------------------------------------------------------------------------
254
-
255
- def self.global_theme
256
-
257
- R.options(scipen: 999) # turn-off scientific notation like 1e+48
258
-
259
- # remove major grids
260
- global_theme = R.theme(panel__grid__major: E.element_blank())
261
- # remove minor grids
262
- global_theme = global_theme + R.theme(panel__grid__minor: E.element_blank)
263
- # remove border
264
- global_theme = global_theme + R.theme(panel__border: E.element_blank)
265
- # remove background
266
- global_theme = global_theme + R.theme(panel__background: E.element_blank)
267
- # Change axis font
268
- global_theme = global_theme +
269
- R.theme(axis__text: E.element_text(size: 8, color: "#000080"))
270
- # change color of axis titles
271
- global_theme = global_theme +
272
- R.theme(axis__title: E.element_text(
273
- color: "#000080",
274
- face: "bold",
275
- size: 8,
276
- hjust: 1))
277
- end
278
-
279
- end
280
- ```
281
-
282
- ```
283
- ## This is the fake output
284
- ```
285
-
286
- We now define a ScatterPlot class:
287
-
288
-
289
- ```ruby
290
- # ScatterPlot.rb
291
- # creates a scatter plot and allow some configuration
292
-
293
- class ScatterPlot
294
-
295
- attr_accessor :title
296
- attr_accessor :subtitle
297
- attr_accessor :caption
298
- attr_accessor :x_label
299
- attr_accessor :y_label
300
-
301
- #---------------------------------------------------------------------------------
302
- # Initialize the plot with the data and the x and y variables
303
- #---------------------------------------------------------------------------------
304
-
305
- def initialize(data, x:, y:)
306
- @data = data
307
- @x = x
308
- @y = y
309
- end
310
-
311
- #---------------------------------------------------------------------------------
312
- # Define groupings by color and size
313
- #---------------------------------------------------------------------------------
314
-
315
- def group_by(color:, size:)
316
- @color_by = color
317
- @size_by = size
318
- end
319
-
320
- #---------------------------------------------------------------------------------
321
- # Add a smoothing line. If confidence is true, also add a confidence interval; if
322
- # false, do not add the confidence interval
323
- #---------------------------------------------------------------------------------
324
-
325
- def add_smoothing_line(method:, confidence: true)
326
- @method = method
327
- @confidence = confidence
328
- end
329
-
330
- #---------------------------------------------------------------------------------
331
- # Creates the graph title, properly formated for this theme
332
- # @param title [String] The title to add to the graph
333
- # @return textGrob that can be included in a graph
334
- #---------------------------------------------------------------------------------
335
-
336
- def graph_params(title: "", subtitle: "", caption: "", x_label: "", y_label: "")
337
- R.labs(
338
- title: title,
339
- subtitle: subtitle,
340
- caption: caption,
341
- y: y_label,
342
- x: x_label,
343
- )
344
- end
345
-
346
- #---------------------------------------------------------------------------------
347
- # Prepare the plot's points
348
- #---------------------------------------------------------------------------------
349
-
350
- def points
351
- params = {}
352
- params[:col] = @color_by if @color_by
353
- params[:size] = @size_by if @size_by
354
- R.geom_point(E.aes(params))
355
- end
356
-
357
- #---------------------------------------------------------------------------------
358
- # Plots the scatterplot
359
- #---------------------------------------------------------------------------------
360
-
361
- def plot(device = 'awt')
362
- device == 'awt' ? R.awt : R.svg
363
-
364
- gg = @data.ggplot(E.aes(x: @x, y: @y)) +
365
- points +
366
- R.geom_smooth(method: @method, se: @confidence) +
367
- R.xlim(R.c(0, 0.1)) +
368
- R.ylim(R.c(0, 500000)) +
369
- graph_params(title: @title,
370
- subtitle: @subtitle,
371
- y_label: @y_label,
372
- x_label: @x_label,
373
- caption: @caption) +
374
- CorpTheme.global_theme
375
-
376
- R.png('scatter_plot.png') if !(device == 'awt')
377
- puts gg
378
- R.dev__off
379
-
380
- end
381
-
382
- end
383
- ```
384
-
385
- ```
386
- ## This is the fake output
387
- ```
388
-
389
- And this is the final code for making the scatter plot with the midwest data
390
-
391
-
392
- ```ruby
393
- require 'galaaz'
394
- require 'ggplot'
395
-
396
- sp = ScatterPlot.new(~:midwest, x: :area, y: :poptotal)
397
- sp.title = "Midwest Dataset - Scatterplot"
398
- sp.subtitle = "Area Vs Population"
399
- sp.caption = "Source: midwest"
400
- sp.x_label = "Area"
401
- sp.y_label = "Population"
402
- sp.group_by(color: :state, size: :popdensity) # try sp.group_by(color: :state)
403
- # available methods: "lm", "glm", "loess", "gam"
404
- sp.add_smoothing_line(method: "glm")
405
- # sp.plot('svg')
406
- puts sp
407
- ```
408
-
409
- ```
410
- ## This is the fake output
411
- ```
412
-
413
- ![Midwest Plot with 'glm' function and modified theme](scatter_plot.png){width=70%}
414
-
415
- # Conclusion
416
-
417
- R is a very powerful language for statistical analysis, data analytics, machine learning, plotting
418
- and many other scientific applications with a very large package ecosystem. However R is often
419
- considered hard to learn and lacking modern programming-language constructs such as object-oriented
420
- classes, modules, lambdas, etc. For this reason, many developers have either started with Python or
421
- switched to it from R.
422
-
423
- With Galaaz, R programmers can almost transparently migrate from R to Ruby, since syntax is
424
- almost identical and they have FastR as the R engine. FastR, by most benchmarks, can be orders of
425
- magnitude faster than GNU R. Further, by using Galaaz the R developer can start (slowly if needed)
426
- using all of Ruby's constructs and libraries that nicely complement R packages.
427
-
428
- For the Ruby developer, Galaaz allows the immediate use of R functions completely transparently. As
429
- shown in the second example above, class ScatterPlot completely hides all the details and R calls
430
- from the Ruby developer. Furthermore, Galaaz is powered by TruffleRuby, which can also be orders of
431
- magnitude faster than MRI Ruby.
@@ -1,481 +0,0 @@
1
- \documentclass[11pt,]{article}
2
- \usepackage{lmodern}
3
- \usepackage{amssymb,amsmath}
4
- \usepackage{ifxetex,ifluatex}
5
- \usepackage{fixltx2e} % provides \textsubscript
6
- \ifnum 0\ifxetex 1\fi\ifluatex 1\fi=0 % if pdftex
7
- \usepackage[T1]{fontenc}
8
- \usepackage[utf8]{inputenc}
9
- \else % if luatex or xelatex
10
- \ifxetex
11
- \usepackage{mathspec}
12
- \else
13
- \usepackage{fontspec}
14
- \fi
15
- \defaultfontfeatures{Ligatures=TeX,Scale=MatchLowercase}
16
- \fi
17
- % use upquote if available, for straight quotes in verbatim environments
18
- \IfFileExists{upquote.sty}{\usepackage{upquote}}{}
19
- % use microtype if available
20
- \IfFileExists{microtype.sty}{%
21
- \usepackage{microtype}
22
- \UseMicrotypeSet[protrusion]{basicmath} % disable protrusion for tt fonts
23
- }{}
24
- \usepackage[margin=1in]{geometry}
25
- \usepackage{hyperref}
26
- \hypersetup{unicode=true,
27
- pdftitle={Ruby Plotting with Galaaz},
28
- pdfauthor={Rodrigo Botafogo},
29
- pdfborder={0 0 0},
30
- breaklinks=true}
31
- \urlstyle{same} % don't use monospace font for urls
32
- \usepackage{color}
33
- \usepackage{fancyvrb}
34
- \newcommand{\VerbBar}{|}
35
- \newcommand{\VERB}{\Verb[commandchars=\\\{\}]}
36
- \DefineVerbatimEnvironment{Highlighting}{Verbatim}{commandchars=\\\{\}}
37
- % Add ',fontsize=\small' for more characters per line
38
- \usepackage{framed}
39
- \definecolor{shadecolor}{RGB}{248,248,248}
40
- \newenvironment{Shaded}{\begin{snugshade}}{\end{snugshade}}
41
- \newcommand{\AlertTok}[1]{\textcolor[rgb]{0.94,0.16,0.16}{#1}}
42
- \newcommand{\AnnotationTok}[1]{\textcolor[rgb]{0.56,0.35,0.01}{\textbf{\textit{#1}}}}
43
- \newcommand{\AttributeTok}[1]{\textcolor[rgb]{0.77,0.63,0.00}{#1}}
44
- \newcommand{\BaseNTok}[1]{\textcolor[rgb]{0.00,0.00,0.81}{#1}}
45
- \newcommand{\BuiltInTok}[1]{#1}
46
- \newcommand{\CharTok}[1]{\textcolor[rgb]{0.31,0.60,0.02}{#1}}
47
- \newcommand{\CommentTok}[1]{\textcolor[rgb]{0.56,0.35,0.01}{\textit{#1}}}
48
- \newcommand{\CommentVarTok}[1]{\textcolor[rgb]{0.56,0.35,0.01}{\textbf{\textit{#1}}}}
49
- \newcommand{\ConstantTok}[1]{\textcolor[rgb]{0.00,0.00,0.00}{#1}}
50
- \newcommand{\ControlFlowTok}[1]{\textcolor[rgb]{0.13,0.29,0.53}{\textbf{#1}}}
51
- \newcommand{\DataTypeTok}[1]{\textcolor[rgb]{0.13,0.29,0.53}{#1}}
52
- \newcommand{\DecValTok}[1]{\textcolor[rgb]{0.00,0.00,0.81}{#1}}
53
- \newcommand{\DocumentationTok}[1]{\textcolor[rgb]{0.56,0.35,0.01}{\textbf{\textit{#1}}}}
54
- \newcommand{\ErrorTok}[1]{\textcolor[rgb]{0.64,0.00,0.00}{\textbf{#1}}}
55
- \newcommand{\ExtensionTok}[1]{#1}
56
- \newcommand{\FloatTok}[1]{\textcolor[rgb]{0.00,0.00,0.81}{#1}}
57
- \newcommand{\FunctionTok}[1]{\textcolor[rgb]{0.00,0.00,0.00}{#1}}
58
- \newcommand{\ImportTok}[1]{#1}
59
- \newcommand{\InformationTok}[1]{\textcolor[rgb]{0.56,0.35,0.01}{\textbf{\textit{#1}}}}
60
- \newcommand{\KeywordTok}[1]{\textcolor[rgb]{0.13,0.29,0.53}{\textbf{#1}}}
61
- \newcommand{\NormalTok}[1]{#1}
62
- \newcommand{\OperatorTok}[1]{\textcolor[rgb]{0.81,0.36,0.00}{\textbf{#1}}}
63
- \newcommand{\OtherTok}[1]{\textcolor[rgb]{0.56,0.35,0.01}{#1}}
64
- \newcommand{\PreprocessorTok}[1]{\textcolor[rgb]{0.56,0.35,0.01}{\textit{#1}}}
65
- \newcommand{\RegionMarkerTok}[1]{#1}
66
- \newcommand{\SpecialCharTok}[1]{\textcolor[rgb]{0.00,0.00,0.00}{#1}}
67
- \newcommand{\SpecialStringTok}[1]{\textcolor[rgb]{0.31,0.60,0.02}{#1}}
68
- \newcommand{\StringTok}[1]{\textcolor[rgb]{0.31,0.60,0.02}{#1}}
69
- \newcommand{\VariableTok}[1]{\textcolor[rgb]{0.00,0.00,0.00}{#1}}
70
- \newcommand{\VerbatimStringTok}[1]{\textcolor[rgb]{0.31,0.60,0.02}{#1}}
71
- \newcommand{\WarningTok}[1]{\textcolor[rgb]{0.56,0.35,0.01}{\textbf{\textit{#1}}}}
72
- \usepackage{graphicx,grffile}
73
- \makeatletter
74
- \def\maxwidth{\ifdim\Gin@nat@width>\linewidth\linewidth\else\Gin@nat@width\fi}
75
- \def\maxheight{\ifdim\Gin@nat@height>\textheight\textheight\else\Gin@nat@height\fi}
76
- \makeatother
77
- % Scale images if necessary, so that they will not overflow the page
78
- % margins by default, and it is still possible to overwrite the defaults
79
- % using explicit options in \includegraphics[width, height, ...]{}
80
- \setkeys{Gin}{width=\maxwidth,height=\maxheight,keepaspectratio}
81
- \IfFileExists{parskip.sty}{%
82
- \usepackage{parskip}
83
- }{% else
84
- \setlength{\parindent}{0pt}
85
- \setlength{\parskip}{6pt plus 2pt minus 1pt}
86
- }
87
- \setlength{\emergencystretch}{3em} % prevent overfull lines
88
- \providecommand{\tightlist}{%
89
- \setlength{\itemsep}{0pt}\setlength{\parskip}{0pt}}
90
- \setcounter{secnumdepth}{5}
91
- % Redefines (sub)paragraphs to behave more like sections
92
- \ifx\paragraph\undefined\else
93
- \let\oldparagraph\paragraph
94
- \renewcommand{\paragraph}[1]{\oldparagraph{#1}\mbox{}}
95
- \fi
96
- \ifx\subparagraph\undefined\else
97
- \let\oldsubparagraph\subparagraph
98
- \renewcommand{\subparagraph}[1]{\oldsubparagraph{#1}\mbox{}}
99
- \fi
100
-
101
- %%% Use protect on footnotes to avoid problems with footnotes in titles
102
- \let\rmarkdownfootnote\footnote%
103
- \def\footnote{\protect\rmarkdownfootnote}
104
-
105
- %%% Change title format to be more compact
106
- \usepackage{titling}
107
-
108
- % Create subtitle command for use in maketitle
109
- \newcommand{\subtitle}[1]{
110
- \posttitle{
111
- \begin{center}\large#1\end{center}
112
- }
113
- }
114
-
115
- \setlength{\droptitle}{-2em}
116
-
117
- \title{Ruby Plotting with Galaaz}
118
- \pretitle{\vspace{\droptitle}\centering\huge}
119
- \posttitle{\par}
120
- \subtitle{An example of tightly coupling Ruby and R in GraalVM}
121
- \author{Rodrigo Botafogo}
122
- \preauthor{\centering\large\emph}
123
- \postauthor{\par}
124
- \predate{\centering\large\emph}
125
- \postdate{\par}
126
- \date{16 October 2018}
127
-
128
- % use Brazilian Portuguese
129
- % \usepackage[brazilian]{babel}
130
- \usepackage[utf8]{inputenc}
131
-
132
- \usepackage{geometry}
133
- \geometry{a4paper, top=1in}
134
-
135
- % needed for kableExtra
136
- \usepackage{longtable}
137
- \usepackage{multirow}
138
- \usepackage[table]{xcolor}
139
- \usepackage{wrapfig}
140
- \usepackage{float}
141
- \usepackage{colortbl}
142
- \usepackage{pdflscape}
143
- \usepackage{tabu}
144
- \usepackage{threeparttable}
145
- \usepackage[normalem]{ulem}
146
-
147
- \usepackage{bbm}
148
- \usepackage{booktabs}
149
- \usepackage{expex}
150
-
151
- \usepackage{graphicx}
152
-
153
- \usepackage{fancyhdr}
154
- % set the header and foot style
155
- % style 'fancy' adds the section name on the header
156
- % and the page number on the footer
157
- \pagestyle{fancy}
158
-
159
- % style 'fancyhf' leaves header and footer empty
160
- %\fancyhf{}
161
-
162
- % sets the left head element to \rightmark, which contains the
163
- % current section (\leftmark is the current chapter)
164
- %\fancyhead[L]{\rightmark} .
165
-
166
- % sets the right head element to the page number.
167
- % \fancyhead[R]{\thepage}
168
-
169
- % lets the head rule disappear.
170
- % \renewcommand{\headrulewidth}{0pt}
171
- % Possible selectors for the optional argument of \fancyhead/\fancyfoot
172
- % are L (left), C (center) or R (right) for the position of the element
173
- % and E (even) or O (odd) to distinguish even and odd pages. If you omit
174
- % E/O the element is set for all pages.
175
-
176
- % \usepackage{lipsum}
177
-
178
- % make available command lastpage
179
- \usepackage{lastpage}
180
-
181
- % default fontsize 11pt better to add
182
- % fontsize on the yaml header
183
- % \usepackage[fontsize=11pt]{scrextend}
184
-
185
- % commands for formatting a table
186
- \usepackage{array}
187
- \newcolumntype{L}[1]{>{\raggedright\let\newline\\\arraybackslash\hspace{0pt}}m{#1}}
188
- \newcolumntype{C}[1]{>{\centering\let\newline\\\arraybackslash\hspace{0pt}}m{#1}}
189
- \newcolumntype{R}[1]{>{\raggedleft\let\newline\\\arraybackslash\hspace{0pt}}m{#1}}
190
-
191
- % needed if we have to import other LaTeX documents
192
- \usepackage{import}
193
-
194
- % Command to import an R variable to latex
195
- \newcommand{\RtoLatex}[2]{\newcommand{#1}{#2}}
196
-
197
- %
198
- %\newcommand{\atraso}[1]{\color{red} \textbf {Tempo desde a Assinatura do Contrato: #1 dias}}
199
-
200
- \begin{document}
201
- \maketitle
202
-
203
- {
204
- \setcounter{tocdepth}{2}
205
- \tableofcontents
206
- }
207
- \hypertarget{introduction}{%
208
- \section{Introduction}\label{introduction}}
209
-
210
- Galaaz is a system for tightly coupling Ruby and R. Ruby is a powerful
211
- language, with a large community, a very large set of libraries and
212
- great for web development. However, it lacks libraries for data science,
213
- statistics, scientific plotting and machine learning. On the other hand,
214
- R is considered one of the most powerful languages for solving all of
215
- the above problems. Maybe the strongest competitor to R is Python with
216
- libraries such as NumPy, pandas, SciPy, scikit-learn and a couple more.
217
-
218
- With Galaaz we do not intend to re-implement any of the scientific
219
- libraries in R; instead, we allow for very tight coupling between the two
220
- languages to the point that the Ruby developer does not need to know
221
- that there is an R engine running. For this to happen we use new
222
- technologies provided by Oracle: GraalVM, TruffleRuby and FastR:
223
-
224
- \begin{verbatim}
225
- GraalVM is a universal virtual machine for running applications
226
- written in JavaScript, Python 3, Ruby, R, JVM-based languages like Java,
227
- Scala, Kotlin, and LLVM-based languages such as C and C++.
228
-
229
- GraalVM removes the isolation between programming languages and enables
230
- interoperability in a shared runtime. It can run either standalone or in
231
- the context of OpenJDK, Node.js, Oracle Database, or MySQL.
232
-
233
- GraalVM allows you to write polyglot applications with a seamless way to
234
- pass values from one language to another. With GraalVM there is no copying
235
- or marshaling necessary as it is with other polyglot systems. This lets
236
- you achieve high performance when language boundaries are crossed. Most
237
- of the time there is no additional cost for crossing a language boundary
238
- at all.
239
-
240
- Often developers have to make uncomfortable compromises that require them
241
- to rewrite their software in other languages. For example:
242
-
243
- * “That library is not available in my language. I need to rewrite it.”
244
- * “That language would be the perfect fit for my problem, but we cannot
245
- run it in our environment.”
246
- * “That problem is already solved in my language, but the language is
247
- too slow.”
248
-
249
- With GraalVM we aim to allow developers to freely choose the right language
250
- for the task at hand without making compromises.
251
- \end{verbatim}
252
-
253
- Interested readers should also check out the following sites:
254
-
255
- \begin{itemize}
256
- \tightlist
257
- \item
258
- \href{https://www.graalvm.org/}{GraalVM Home}
259
- \item
260
- \href{https://github.com/oracle/truffleruby}{TruffleRuby}
261
- \item
262
- \href{https://github.com/oracle/fastr}{FastR}
263
- \item
264
- \href{https://medium.com/graalvm/faster-r-with-fastr-4b8db0e0dceb}{Faster
265
- R with FastR}
266
- \end{itemize}
267
-
268
- \hypertarget{what-does-galaaz-mean}{%
269
- \subsection{What does Galaaz mean}\label{what-does-galaaz-mean}}
270
-
271
- Galaaz is the Portuguese name for ``Galahad''. From Wikipedia:
272
-
273
- \begin{verbatim}
274
- Sir Galahad (sometimes referred to as Galeas or Galath),
275
- in Arthurian legend, is a knight of King Arthur's Round Table and one
276
- of the three achievers of the Holy Grail. He is the illegitimate son
277
- of Sir Lancelot and Elaine of Corbenic, and is renowned for his
278
- gallantry and purity as the most perfect of all knights. Emerging quite
279
- late in the medieval Arthurian tradition, Sir Galahad first appears in the
280
- Lancelot–Grail cycle, and his story is taken up in later works such as
281
- the Post-Vulgate Cycle and Sir Thomas Malory's Le Morte d'Arthur.
282
- His name should not be mistaken with Galehaut, a different knight from
283
- Arthurian legend.
284
- \end{verbatim}
285
-
286
- \hypertarget{galaaz-demo}{%
287
- \section{Galaaz Demo}\label{galaaz-demo}}
288
-
289
- \hypertarget{prerequisites}{%
290
- \subsection{Prerequisites}\label{prerequisites}}
291
-
292
- \begin{itemize}
293
- \tightlist
294
- \item
295
- GraalVM (\textgreater{}= rc7)
296
- \item
297
- TruffleRuby
298
- \item
299
- FastR
300
- \end{itemize}
301
-
302
- The following R packages will be automatically installed when necessary,
303
- but could be installed prior to the demo if desired:
304
-
305
- \begin{itemize}
306
- \tightlist
307
- \item
308
- ggplot2
309
- \item
310
- gridExtra
311
- \end{itemize}
312
-
313
- Installation of R packages requires a development environment. In Linux,
314
- the GNU compiler and tools should be enough. I am not sure what is
315
- needed on the Mac.
316
-
317
- In order to run the `specs' the following Ruby package is necessary:
318
-
319
- \begin{itemize}
320
- \tightlist
321
- \item
322
- gem install rspec
323
- \end{itemize}
324
-
325
- \hypertarget{preparation}{%
326
- \subsection{Preparation}\label{preparation}}
327
-
328
- \begin{itemize}
329
- \tightlist
330
- \item
331
- gem install galaaz
332
- \end{itemize}
333
-
334
- \hypertarget{running-the-demo}{%
335
- \subsection{Running the demo}\label{running-the-demo}}
336
-
337
- The ggplot example for this demo was extracted from:
338
- \url{http://r-statistics.co/Top50-Ggplot2-Visualizations-MasterList-R-Code.html}.
339
-
340
- On the console, run:
341
-
342
- \begin{verbatim}
343
- > galaaz master_list:scatter_plot
344
- \end{verbatim}
345
-
346
- \hypertarget{running-other-demos}{%
347
- \subsection{Running other demos}\label{running-other-demos}}
348
-
349
- Running the following on the console
350
-
351
- \begin{verbatim}
352
- > galaaz -T
353
- \end{verbatim}
354
-
355
- will show a list with all available demos. To run any of the demos in
356
- the list, replace the call to `rake' with `galaaz'. For instance, one
357
- of the examples in the list is `rake sthda:bar'. In order to run this
358
- example just do `galaaz sthda:bar'. Doing `galaaz sthda:all' will run
359
- all demos in the sthda category. Some of the examples require `rspec'
360
- to be available. To install `rspec' just do `gem install rspec'.
361
-
362
- \hypertarget{the-demo-code}{%
363
- \section{The demo code}\label{the-demo-code}}
364
-
365
- The following is the Ruby code and plot for the above example. There is
366
- a small difference between the code in the example and the code below.
367
- If the example is run, the plot will appear on the screen; below, we
368
- generate an `svg' image and then include it in this document. In order
369
- to generate an image, the R.svg device is used. To generate the plot on
370
- the screen, use the R.awt device, as commented in the code.
371
-
372
- \begin{figure}
373
- \centering
374
- \includegraphics[width=0.7\textwidth,height=\textheight]{midwest.png}
375
- \caption{Midwest Plot}
376
- \end{figure}
377
-
378
- In R, the code to generate this plot is the following:
379
-
380
- \begin{Shaded}
381
- \begin{Highlighting}[]
382
- \CommentTok{# install.packages("ggplot2")}
383
- \CommentTok{# load package and data}
384
- \KeywordTok{options}\NormalTok{(}\DataTypeTok{scipen=}\DecValTok{999}\NormalTok{) }\CommentTok{# turn-off scientific notation like 1e+48}
385
- \KeywordTok{library}\NormalTok{(ggplot2)}
386
- \KeywordTok{theme_set}\NormalTok{(}\KeywordTok{theme_bw}\NormalTok{()) }\CommentTok{# pre-set the bw theme.}
387
- \KeywordTok{data}\NormalTok{(}\StringTok{"midwest"}\NormalTok{, }\DataTypeTok{package =} \StringTok{"ggplot2"}\NormalTok{)}
388
- \CommentTok{# midwest <- read.csv("http://goo.gl/G1K41K") # bkup data source}
389
-
390
- \CommentTok{# Scatterplot}
391
- \NormalTok{gg <-}\StringTok{ }\KeywordTok{ggplot}\NormalTok{(midwest, }\KeywordTok{aes}\NormalTok{(}\DataTypeTok{x=}\NormalTok{area, }\DataTypeTok{y=}\NormalTok{poptotal)) }\OperatorTok{+}\StringTok{ }
392
- \StringTok{ }\KeywordTok{geom_point}\NormalTok{(}\KeywordTok{aes}\NormalTok{(}\DataTypeTok{col=}\NormalTok{state, }\DataTypeTok{size=}\NormalTok{popdensity)) }\OperatorTok{+}\StringTok{ }
393
- \StringTok{ }\KeywordTok{geom_smooth}\NormalTok{(}\DataTypeTok{method=}\StringTok{"loess"}\NormalTok{, }\DataTypeTok{se=}\NormalTok{F) }\OperatorTok{+}\StringTok{ }
394
- \StringTok{ }\KeywordTok{xlim}\NormalTok{(}\KeywordTok{c}\NormalTok{(}\DecValTok{0}\NormalTok{, }\FloatTok{0.1}\NormalTok{)) }\OperatorTok{+}\StringTok{ }
395
- \StringTok{ }\KeywordTok{ylim}\NormalTok{(}\KeywordTok{c}\NormalTok{(}\DecValTok{0}\NormalTok{, }\DecValTok{500000}\NormalTok{)) }\OperatorTok{+}\StringTok{ }
396
- \StringTok{ }\KeywordTok{labs}\NormalTok{(}\DataTypeTok{subtitle=}\StringTok{"Area Vs Population"}\NormalTok{, }
397
- \DataTypeTok{y=}\StringTok{"Population"}\NormalTok{, }
398
- \DataTypeTok{x=}\StringTok{"Area"}\NormalTok{, }
399
- \DataTypeTok{title=}\StringTok{"Scatterplot"}\NormalTok{, }
400
- \DataTypeTok{caption =} \StringTok{"Source: midwest"}\NormalTok{)}
401
-
402
- \KeywordTok{plot}\NormalTok{(gg)}
403
- \end{Highlighting}
404
- \end{Shaded}
405
-
406
- Note that both versions are very similar. The Ruby code requires the use of
407
- ``R.'' before calling any functions, for instance R function
408
- `geom\_point' becomes `R.geom\_point' in Ruby. R named parameters such
409
- as (col = state, size = popdensity), become in Ruby (col: :state, size:
410
- :popdensity).
411
-
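The naming convention described above can be pictured with a small toy sketch. It is only an illustration under stated assumptions: `ToyR` is a hypothetical module invented here, it merely renders a Ruby call as R source text, and it is not how Galaaz itself dispatches calls to the R engine.

```ruby
# Toy illustration only: ToyR is hypothetical and is NOT Galaaz's implementation.
# It renders a Ruby method call as the text of the equivalent R call.
module ToyR
  # Convert a Ruby value to R source text: symbols become bare names,
  # strings are quoted, anything else uses its plain string form.
  def self.to_r(value)
    case value
    when Symbol then value.to_s
    when String then value.inspect
    else value.to_s
    end
  end

  # Treat any unknown method name as the name of an R function.
  def self.method_missing(name, *args, **named)
    positional = args.map { |a| to_r(a) }
    keywords = named.map { |key, val| "#{key} = #{to_r(val)}" }
    "#{name}(#{(positional + keywords).join(', ')})"
  end
end

# Ruby keyword arguments map one-to-one onto R named parameters.
puts ToyR.geom_point(col: :state, size: :popdensity)
```

The point of the sketch is only the mapping itself: a Ruby keyword argument with a symbol value corresponds to an R named parameter with a bare name.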
412
- One last point that needs to be observed is the call to the `aes'
413
- function. In Ruby instead of doing `R.aes', we use `E.aes'. The
414
- explanation of why E.aes is needed is an advanced topic in R and depends
415
- on what is known as Non-standard Evaluation (NSE) in R. In short,
416
- function `aes' is lazily evaluated in R, i.e., in R when calling
417
- geom\_point(aes(col=state, size=popdensity)), function geom\_point
418
- receives as argument something similar to a string containing
419
- `aes(col=state, size=popdensity)', and the aes function will be
420
- evaluated inside the geom\_point function. In Ruby, there is no lazy
421
- evaluation and doing R.aes would try to evaluate aes immediately. In
422
- order to delay the evaluation of function aes we need to use E.aes. The
423
- reader interested in NSE in R is directed to
424
- \url{http://adv-r.had.co.nz/Computing-on-the-language.html}.
425
-
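The difference between immediate and delayed evaluation can be sketched in plain Ruby. This is a minimal illustration, not Galaaz code: the methods `eager` and `deferred` and the small `midwest` hash are hypothetical names introduced here, and a Ruby block stands in for the role that E.aes plays.

```ruby
# Toy illustration only: these helpers are hypothetical, not part of Galaaz.

# Eager call: Ruby evaluates the argument before the method body runs.
def eager(value)
  value
end

# Deferred call: the block is stored and only run when the callee decides to,
# against whatever data the callee supplies.
def deferred(data, &expression)
  expression.call(data)
end

midwest = { state: %w[IL IN], popdensity: [1270.96, 759.0] }

# `state` is not a Ruby variable here, so evaluating it immediately fails.
begin
  eager(state)
rescue NameError => e
  puts "eager evaluation failed: #{e.class}"
end

# Wrapping the expression in a block delays it until the data is available.
states = deferred(midwest) { |data| data[:state] }
puts states.inspect
```

In the same spirit, an expression such as `col = state` only makes sense once it is evaluated against the data frame, which is why its evaluation has to be postponed.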
426
- \hypertarget{an-extension-to-the-example}{%
427
- \section{An extension to the
428
- example}\label{an-extension-to-the-example}}
429
-
430
- If both versions are so similar, then why would one use Ruby instead of R
431
- and what good is galaaz after all?
432
-
433
- Ruby is a modern OO language with numerous very useful constructs such
434
- as classes, modules, blocks, procs, etc. The example above focuses on the
435
- coupling of both languages, and does not show the use of other Ruby
436
- constructs. In the following example, we will show a more complex
437
- example using other Ruby constructs. This is certainly not a very well
438
- written and robust Ruby code, but it gives an idea of how Ruby and R are
439
- strongly coupled.
440
-
441
- Let's imagine that we work in a corporation that has its own plot themes.
442
- So, it has defined a `CorpTheme' module. Plots in this corporation
443
- should not have grids, numbers in labels should not use scientific
444
- notation and the preferred color is blue.
445
-
446
- We now define a ScatterPlot class:
447
-
448
- And this is the final code for making the scatter plot with the midwest
449
- data
450
-
451
- \begin{figure}
452
- \centering
453
- \includegraphics[width=0.7\textwidth,height=\textheight]{scatter_plot.png}
454
- \caption{Midwest Plot with `glm' function and modified theme}
455
- \end{figure}
456
-
457
- \hypertarget{conclusion}{%
458
- \section{Conclusion}\label{conclusion}}
459
-
460
- R is a very powerful language for statistical analysis, data analytics,
461
- machine learning, plotting and many other scientific applications with a
462
- very large package ecosystem. However R is often considered hard to
463
- learn and lacking modern programming-language constructs such as object
464
- oriented classes, modules, lambdas, etc. For this reason, many
465
- developers have started with Python or switched to it from R.
466
-
467
- With Galaaz, R programmers can almost transparently migrate from R to
468
- Ruby, since the syntax is almost identical and they have FastR as the R
469
- engine. FastR, by most benchmarks, can be orders of magnitude faster
470
- than GNU R. Further, by using Galaaz the R developer can start (slowly
471
- if needed) using all of Ruby's constructs and libraries that nicely
472
- complement R packages.
473
-
474
- For the Ruby developer, Galaaz allows the immediate use of R functions
475
- completely transparently. As shown in the second example above, class
476
- ScatterPlot completely hides all the details and R calls from the Ruby
477
- developer. Furthermore, Galaaz is powered by TruffleRuby, which can also be
478
- orders of magnitude faster than MRI Ruby.
479
-
480
-
481
- \end{document}