galaaz 0.4.10 → 0.5.0

Files changed (163)
  1. checksums.yaml +4 -4
  2. data/README.md +2048 -531
  3. data/Rakefile +3 -2
  4. data/bin/gknit +152 -6
  5. data/bin/gknit-draft +105 -0
  6. data/bin/gknit-draft.rb +28 -0
  7. data/bin/gknit_Rscript +127 -0
  8. data/bin/grun +27 -1
  9. data/bin/gstudio +47 -4
  10. data/bin/{gstudio.rb → gstudio_irb.rb} +0 -0
  11. data/bin/gstudio_pry.rb +7 -0
  12. data/blogs/galaaz_ggplot/galaaz_ggplot.html +10 -195
  13. data/blogs/galaaz_ggplot/galaaz_ggplot.md +404 -0
  14. data/blogs/galaaz_ggplot/galaaz_ggplot_files/figure-html/midwest_rb.png +0 -0
  15. data/blogs/galaaz_ggplot/galaaz_ggplot_files/figure-html/scatter_plot_rb.png +0 -0
  16. data/blogs/gknit/gknit.Rmd +5 -3
  17. data/blogs/gknit/gknit.pdf +0 -0
  18. data/blogs/gknit/lst.rds +0 -0
  19. data/blogs/manual/lst.rds +0 -0
  20. data/blogs/manual/manual.Rmd +826 -53
  21. data/blogs/manual/manual.html +2338 -695
  22. data/blogs/manual/manual.md +2032 -539
  23. data/blogs/manual/manual.pdf +0 -0
  24. data/blogs/manual/manual.tex +1804 -594
  25. data/blogs/manual/manual_files/figure-html/bubble-1.png +0 -0
  26. data/blogs/manual/manual_files/figure-html/diverging_bar.png +0 -0
  27. data/blogs/manual/manual_files/figure-latex/bubble-1.png +0 -0
  28. data/blogs/manual/manual_files/figure-latex/diverging_bar.pdf +0 -0
  29. data/blogs/manual/model.rb +41 -0
  30. data/blogs/nse_dplyr/nse_dplyr.Rmd +226 -73
  31. data/blogs/nse_dplyr/nse_dplyr.html +254 -336
  32. data/blogs/nse_dplyr/nse_dplyr.md +353 -158
  33. data/blogs/oh_my/oh_my.html +274 -386
  34. data/blogs/oh_my/oh_my.md +208 -205
  35. data/blogs/ruby_plot/ruby_plot.html +20 -205
  36. data/blogs/ruby_plot/ruby_plot.md +14 -15
  37. data/blogs/ruby_plot/ruby_plot_files/figure-html/dose_len.png +0 -0
  38. data/blogs/ruby_plot/ruby_plot_files/figure-html/facet_by_delivery.png +0 -0
  39. data/blogs/ruby_plot/ruby_plot_files/figure-html/facet_by_dose.png +0 -0
  40. data/blogs/ruby_plot/ruby_plot_files/figure-html/facets_by_delivery_color.png +0 -0
  41. data/blogs/ruby_plot/ruby_plot_files/figure-html/facets_by_delivery_color2.png +0 -0
  42. data/blogs/ruby_plot/ruby_plot_files/figure-html/facets_with_decorations.png +0 -0
  43. data/blogs/ruby_plot/ruby_plot_files/figure-html/facets_with_jitter.png +0 -0
  44. data/blogs/ruby_plot/ruby_plot_files/figure-html/facets_with_points.png +0 -0
  45. data/blogs/ruby_plot/ruby_plot_files/figure-html/final_box_plot.png +0 -0
  46. data/blogs/ruby_plot/ruby_plot_files/figure-html/final_violin_plot.png +0 -0
  47. data/blogs/ruby_plot/ruby_plot_files/figure-html/violin_with_jitter.png +0 -0
  48. data/examples/Bibliography/master.bib +50 -0
  49. data/examples/Bibliography/stats.bib +72 -0
  50. data/examples/islr/x_y_rnorm.jpg +0 -0
  51. data/examples/latex_templates/Test-acm_article/Makefile +16 -0
  52. data/examples/latex_templates/Test-acm_article/Test-acm_article.Rmd +65 -0
  53. data/examples/latex_templates/Test-acm_article/acm_proc_article-sp.cls +1670 -0
  54. data/examples/latex_templates/Test-acm_article/sensys-abstract.cls +703 -0
  55. data/examples/latex_templates/Test-acm_article/sigproc.bib +59 -0
  56. data/examples/latex_templates/Test-acs_article/Test-acs_article.Rmd +260 -0
  57. data/examples/latex_templates/Test-acs_article/Test-acs_article.pdf +0 -0
  58. data/examples/latex_templates/Test-acs_article/acs-Test-acs_article.bib +11 -0
  59. data/examples/latex_templates/Test-acs_article/acs-my_output.bib +11 -0
  60. data/examples/latex_templates/Test-acs_article/acstest.bib +17 -0
  61. data/examples/latex_templates/Test-aea_article/AEA.cls +1414 -0
  62. data/{blogs/gknit/marshal.dump → examples/latex_templates/Test-aea_article/BibFile.bib} +0 -0
  63. data/examples/latex_templates/Test-aea_article/Test-aea_article.Rmd +108 -0
  64. data/examples/latex_templates/Test-aea_article/Test-aea_article.pdf +0 -0
  65. data/examples/latex_templates/Test-aea_article/aea.bst +1269 -0
  66. data/examples/latex_templates/Test-aea_article/multicol.sty +853 -0
  67. data/examples/latex_templates/Test-aea_article/references.bib +0 -0
  68. data/examples/latex_templates/Test-aea_article/setspace.sty +546 -0
  69. data/examples/latex_templates/Test-amq_article/Test-amq_article.Rmd +256 -0
  70. data/examples/latex_templates/Test-amq_article/Test-amq_article.pdf +0 -0
  71. data/examples/latex_templates/Test-amq_article/Test-amq_article.pdfsync +3397 -0
  72. data/examples/latex_templates/Test-amq_article/pics/Figure2.pdf +0 -0
  73. data/examples/latex_templates/Test-ams_article/Test-ams_article.Rmd +215 -0
  74. data/examples/latex_templates/Test-ams_article/amstest.bib +436 -0
  75. data/examples/latex_templates/Test-asa_article/Test-asa_article.Rmd +153 -0
  76. data/examples/latex_templates/Test-asa_article/Test-asa_article.pdf +0 -0
  77. data/examples/latex_templates/Test-asa_article/agsm.bst +1353 -0
  78. data/examples/latex_templates/Test-asa_article/bibliography.bib +233 -0
  79. data/examples/latex_templates/Test-ieee_article/IEEEtran.bst +2409 -0
  80. data/examples/latex_templates/Test-ieee_article/IEEEtran.cls +6346 -0
  81. data/examples/latex_templates/Test-ieee_article/Test-ieee_article.Rmd +175 -0
  82. data/examples/latex_templates/Test-ieee_article/Test-ieee_article.pdf +0 -0
  83. data/examples/latex_templates/Test-ieee_article/mybibfile.bib +20 -0
  84. data/examples/latex_templates/Test-rjournal_article/RJournal.sty +335 -0
  85. data/examples/latex_templates/Test-rjournal_article/RJreferences.bib +18 -0
  86. data/examples/latex_templates/Test-rjournal_article/RJwrapper.pdf +0 -0
  87. data/examples/latex_templates/Test-rjournal_article/Test-rjournal_article.Rmd +52 -0
  88. data/examples/latex_templates/Test-springer_article/Test-springer_article.Rmd +65 -0
  89. data/examples/latex_templates/Test-springer_article/Test-springer_article.pdf +0 -0
  90. data/examples/latex_templates/Test-springer_article/bibliography.bib +26 -0
  91. data/examples/latex_templates/Test-springer_article/spbasic.bst +1658 -0
  92. data/examples/latex_templates/Test-springer_article/spmpsci.bst +1512 -0
  93. data/examples/latex_templates/Test-springer_article/spphys.bst +1443 -0
  94. data/examples/latex_templates/Test-springer_article/svglov3.clo +113 -0
  95. data/examples/latex_templates/Test-springer_article/svjour3.cls +1431 -0
  96. data/examples/rmarkdown/svm-rmarkdown-anon-ms-example/svm-rmarkdown-anon-ms-example.Rmd +73 -0
  97. data/examples/rmarkdown/svm-rmarkdown-anon-ms-example/svm-rmarkdown-anon-ms-example.pdf +0 -0
  98. data/examples/rmarkdown/svm-rmarkdown-article-example/svm-rmarkdown-article-example.Rmd +382 -0
  99. data/examples/rmarkdown/svm-rmarkdown-article-example/svm-rmarkdown-article-example.pdf +0 -0
  100. data/examples/rmarkdown/svm-rmarkdown-beamer-example/svm-rmarkdown-beamer-example.Rmd +164 -0
  101. data/examples/rmarkdown/svm-rmarkdown-beamer-example/svm-rmarkdown-beamer-example.pdf +0 -0
  102. data/examples/rmarkdown/svm-rmarkdown-cv/svm-rmarkdown-cv.Rmd +92 -0
  103. data/examples/rmarkdown/svm-rmarkdown-cv/svm-rmarkdown-cv.pdf +0 -0
  104. data/examples/rmarkdown/svm-rmarkdown-syllabus-example/attend-grade-relationships.csv +482 -0
  105. data/examples/rmarkdown/svm-rmarkdown-syllabus-example/svm-rmarkdown-syllabus-example.Rmd +280 -0
  106. data/examples/rmarkdown/svm-rmarkdown-syllabus-example/svm-rmarkdown-syllabus-example.pdf +0 -0
  107. data/examples/rmarkdown/svm-xaringan-example/svm-xaringan-example.Rmd +386 -0
  108. data/lib/R_interface/r.rb +1 -1
  109. data/lib/R_interface/r_libs.R +1 -1
  110. data/lib/R_interface/r_methods.rb +10 -0
  111. data/lib/R_interface/rpkg.rb +1 -0
  112. data/lib/R_interface/rsupport.rb +4 -6
  113. data/lib/gknit.rb +2 -0
  114. data/lib/gknit/draft.rb +105 -0
  115. data/lib/gknit/knitr_engine.rb +0 -33
  116. data/lib/util/exec_ruby.rb +1 -27
  117. data/specs/figures/bg.jpeg +0 -0
  118. data/specs/figures/bg.png +0 -0
  119. data/specs/figures/dose_len.png +0 -0
  120. data/specs/figures/no_args.jpeg +0 -0
  121. data/specs/figures/no_args.png +0 -0
  122. data/specs/figures/width_height.jpeg +0 -0
  123. data/specs/figures/width_height.png +0 -0
  124. data/specs/figures/width_height_units1.jpeg +0 -0
  125. data/specs/figures/width_height_units1.png +0 -0
  126. data/specs/figures/width_height_units2.jpeg +0 -0
  127. data/specs/figures/width_height_units2.png +0 -0
  128. data/specs/r_dataframe.spec.rb +11 -11
  129. data/specs/ruby_expression.spec.rb +1 -0
  130. data/specs/tmp.rb +41 -20
  131. data/version.rb +1 -1
  132. metadata +73 -35
  133. data/blogs/galaaz_ggplot/galaaz_ggplot.aux +0 -41
  134. data/blogs/galaaz_ggplot/galaaz_ggplot.out +0 -10
  135. data/blogs/galaaz_ggplot/galaaz_ggplot_files/figure-latex/midwest_rb.pdf +0 -0
  136. data/blogs/galaaz_ggplot/galaaz_ggplot_files/figure-latex/scatter_plot_rb.pdf +0 -0
  137. data/blogs/gknit/gknit.md +0 -1430
  138. data/blogs/gknit/gknit.tex +0 -1358
  139. data/blogs/manual/graph.rb +0 -29
  140. data/blogs/nse_dplyr/nse_dplyr.tex +0 -1373
  141. data/blogs/ruby_plot/ruby_plot.Rmd_external_figs +0 -662
  142. data/blogs/ruby_plot/ruby_plot_files/figure-html/dose_len.svg +0 -57
  143. data/blogs/ruby_plot/ruby_plot_files/figure-html/facet_by_delivery.svg +0 -106
  144. data/blogs/ruby_plot/ruby_plot_files/figure-html/facet_by_dose.svg +0 -110
  145. data/blogs/ruby_plot/ruby_plot_files/figure-html/facets_by_delivery_color.svg +0 -174
  146. data/blogs/ruby_plot/ruby_plot_files/figure-html/facets_by_delivery_color2.svg +0 -236
  147. data/blogs/ruby_plot/ruby_plot_files/figure-html/facets_with_jitter.svg +0 -296
  148. data/blogs/ruby_plot/ruby_plot_files/figure-html/facets_with_points.svg +0 -236
  149. data/blogs/ruby_plot/ruby_plot_files/figure-html/final_box_plot.svg +0 -218
  150. data/blogs/ruby_plot/ruby_plot_files/figure-html/final_violin_plot.svg +0 -128
  151. data/blogs/ruby_plot/ruby_plot_files/figure-html/violin_with_jitter.svg +0 -150
  152. data/blogs/ruby_plot/ruby_plot_files/figure-latex/dose_len.png +0 -0
  153. data/blogs/ruby_plot/ruby_plot_files/figure-latex/facet_by_delivery.png +0 -0
  154. data/blogs/ruby_plot/ruby_plot_files/figure-latex/facet_by_dose.png +0 -0
  155. data/blogs/ruby_plot/ruby_plot_files/figure-latex/facets_by_delivery_color.png +0 -0
  156. data/blogs/ruby_plot/ruby_plot_files/figure-latex/facets_by_delivery_color2.png +0 -0
  157. data/blogs/ruby_plot/ruby_plot_files/figure-latex/facets_with_decorations.png +0 -0
  158. data/blogs/ruby_plot/ruby_plot_files/figure-latex/facets_with_jitter.png +0 -0
  159. data/blogs/ruby_plot/ruby_plot_files/figure-latex/facets_with_points.png +0 -0
  160. data/blogs/ruby_plot/ruby_plot_files/figure-latex/final_box_plot.png +0 -0
  161. data/blogs/ruby_plot/ruby_plot_files/figure-latex/final_violin_plot.png +0 -0
  162. data/blogs/ruby_plot/ruby_plot_files/figure-latex/violin_with_jitter.png +0 -0
  163. data/examples/paper/paper.rb +0 -36
@@ -1,41 +0,0 @@
1
- \relax
2
- \providecommand\hyper@newdestlabel[2]{}
3
- \providecommand\HyperFirstAtBeginDocument{\AtBeginDocument}
4
- \HyperFirstAtBeginDocument{\ifx\hyper@anchor\@undefined
5
- \global\let\oldcontentsline\contentsline
6
- \gdef\contentsline#1#2#3#4{\oldcontentsline{#1}{#2}{#3}}
7
- \global\let\oldnewlabel\newlabel
8
- \gdef\newlabel#1#2{\newlabelxx{#1}#2}
9
- \gdef\newlabelxx#1#2#3#4#5#6{\oldnewlabel{#1}{{#2}{#3}}}
10
- \AtEndDocument{\ifx\hyper@anchor\@undefined
11
- \let\contentsline\oldcontentsline
12
- \let\newlabel\oldnewlabel
13
- \fi}
14
- \fi}
15
- \global\let\hyper@last\relax
16
- \gdef\HyperFirstAtBeginDocument#1{#1}
17
- \providecommand\HyField@AuxAddToFields[1]{}
18
- \providecommand\HyField@AuxAddToCoFields[2]{}
19
- \@writefile{toc}{\contentsline {section}{\numberline {1}Introduction}{1}{section.1}}
20
- \newlabel{introduction}{{1}{1}{Introduction}{section.1}{}}
21
- \@writefile{toc}{\contentsline {subsection}{\numberline {1.1}What does Galaaz mean}{2}{subsection.1.1}}
22
- \newlabel{what-does-galaaz-mean}{{1.1}{2}{What does Galaaz mean}{subsection.1.1}{}}
23
- \@writefile{toc}{\contentsline {section}{\numberline {2}Galaaz Demo}{2}{section.2}}
24
- \newlabel{galaaz-demo}{{2}{2}{Galaaz Demo}{section.2}{}}
25
- \@writefile{toc}{\contentsline {subsection}{\numberline {2.1}Prerequisites}{2}{subsection.2.1}}
26
- \newlabel{prerequisites}{{2.1}{2}{Prerequisites}{subsection.2.1}{}}
27
- \@writefile{toc}{\contentsline {subsection}{\numberline {2.2}Preparation}{3}{subsection.2.2}}
28
- \newlabel{preparation}{{2.2}{3}{Preparation}{subsection.2.2}{}}
29
- \@writefile{toc}{\contentsline {subsection}{\numberline {2.3}Running the demo}{3}{subsection.2.3}}
30
- \newlabel{running-the-demo}{{2.3}{3}{Running the demo}{subsection.2.3}{}}
31
- \@writefile{toc}{\contentsline {subsection}{\numberline {2.4}Running other demos}{3}{subsection.2.4}}
32
- \newlabel{running-other-demos}{{2.4}{3}{Running other demos}{subsection.2.4}{}}
33
- \@writefile{toc}{\contentsline {section}{\numberline {3}The demo code}{3}{section.3}}
34
- \newlabel{the-demo-code}{{3}{3}{The demo code}{section.3}{}}
35
- \@writefile{toc}{\contentsline {section}{\numberline {4}An extension to the example}{5}{section.4}}
36
- \newlabel{an-extension-to-the-example}{{4}{5}{An extension to the example}{section.4}{}}
37
- \@writefile{toc}{\contentsline {section}{\numberline {5}Conclusion}{9}{section.5}}
38
- \newlabel{conclusion}{{5}{9}{Conclusion}{section.5}{}}
39
- \newlabel{LastPage}{{}{10}{}{page.10}{}}
40
- \xdef\lastpage@lastpage{10}
41
- \xdef\lastpage@lastpageHy{10}
@@ -1,10 +0,0 @@
1
- \BOOKMARK [1][-]{section.1}{\376\377\000I\000n\000t\000r\000o\000d\000u\000c\000t\000i\000o\000n}{}% 1
2
- \BOOKMARK [2][-]{subsection.1.1}{\376\377\000W\000h\000a\000t\000\040\000d\000o\000e\000s\000\040\000G\000a\000l\000a\000a\000z\000\040\000m\000e\000a\000n}{section.1}% 2
3
- \BOOKMARK [1][-]{section.2}{\376\377\000G\000a\000l\000a\000a\000z\000\040\000D\000e\000m\000o}{}% 3
4
- \BOOKMARK [2][-]{subsection.2.1}{\376\377\000P\000r\000e\000r\000e\000q\000u\000i\000s\000i\000t\000e\000s}{section.2}% 4
5
- \BOOKMARK [2][-]{subsection.2.2}{\376\377\000P\000r\000e\000p\000a\000r\000a\000t\000i\000o\000n}{section.2}% 5
6
- \BOOKMARK [2][-]{subsection.2.3}{\376\377\000R\000u\000n\000n\000i\000n\000g\000\040\000t\000h\000e\000\040\000d\000e\000m\000o}{section.2}% 6
7
- \BOOKMARK [2][-]{subsection.2.4}{\376\377\000R\000u\000n\000n\000i\000n\000g\000\040\000o\000t\000h\000e\000r\000\040\000d\000e\000m\000o\000s}{section.2}% 7
8
- \BOOKMARK [1][-]{section.3}{\376\377\000T\000h\000e\000\040\000d\000e\000m\000o\000\040\000c\000o\000d\000e}{}% 8
9
- \BOOKMARK [1][-]{section.4}{\376\377\000A\000n\000\040\000e\000x\000t\000e\000n\000s\000i\000o\000n\000\040\000t\000o\000\040\000t\000h\000e\000\040\000e\000x\000a\000m\000p\000l\000e}{}% 9
10
- \BOOKMARK [1][-]{section.5}{\376\377\000C\000o\000n\000c\000l\000u\000s\000i\000o\000n}{}% 10
@@ -1,1430 +0,0 @@
1
- ---
2
- title: "How to do reproducible research in Ruby with gKnit"
3
- author:
4
- - "Rodrigo Botafogo"
5
- - "Daniel Mossé - University of Pittsburgh"
6
- tags: [Tech, Data Science, Ruby, R, GraalVM]
7
- date: "29/04/2019"
8
- bibliography: stats.bib
9
- output:
10
- html_document:
11
- self_contained: true
12
- keep_md: true
13
- pdf_document:
14
- includes:
15
- in_header: ["../../sty/galaaz.sty"]
16
- number_sections: yes
17
- ---
18
-
19
-
20
-
21
- # Introduction
22
-
23
- The idea of "literate programming" was first introduced by Donald Knuth in the
24
- 1980's [@Knuth:literate_programming].
25
- The main intention of this approach was to develop software interspersing macro snippets,
26
- traditional source code, and a natural language such as English in a document
27
- that could be compiled into
28
- executable code and at the same time easily read by a human developer. According to Knuth
29
- "The practitioner of
30
- literate programming can be regarded as an essayist, whose main concern is with exposition
31
- and excellence of style."
32
-
33
- The idea of literate programming evolved into the idea of reproducible research, in which
34
- all the data, software code, documentation, graphics etc. needed to reproduce the research
35
- and its reports could be included in a
36
- single document or set of documents that when distributed to peers could be rerun generating
37
- the same output and reports.
38
-
39
- The R community has put a great deal of effort in reproducible research. In 2002, Sweave was
40
- introduced and it allowed mixing R code with Latex generating high quality PDF documents. A
41
- Sweave document could include code, the results of executing the code, graphics and text
42
- such that it contained the whole narrative to reproduce the research. In
43
- 2012, Knitr, developed by Yihui Xie from RStudio, was released to replace Sweave and to
44
- consolidate in one single package the many extensions and add-on packages that
45
- were necessary for Sweave.
46
-
47
- With Knitr, __R markdown__ was also developed, an extension to the
48
- Markdown format. With __R markdown__ and Knitr it is possible to generate reports in a multitude
49
- of formats such as HTML, markdown, Latex, PDF, dvi, etc. __R markdown__ also allows the use of
50
- multiple programming languages such as R, Ruby, Python, etc. in the same document.
51
-
52
- In __R markdown__, text is interspersed with
53
- code chunks that can be executed and both the code and its results can become
54
- part of the final report. Although __R markdown__ allows multiple programming languages in the
55
- same document, only R and Python (with
56
- the reticulate package) can persist variables between chunks. For other languages, such as
57
- Ruby, every chunk will start a new process and thus all data is lost between chunks, unless it
58
- is somehow stored in a data file that is read by the next chunk.
59
-
60
- Being able to persist data
61
- between chunks is critical for literate programming otherwise the flow of the narrative is lost
62
- by all the effort of having to save data and then reload it. Although this might, at first, seem like
63
- a small nuisance, not being able to persist data between chunks is a major issue. For example, let's
64
- take a look at the following simple example in which we want to show how to create a list and then
65
- use it. Let's first assume that data cannot be persisted between chunks. In the next chunk we
66
- create a list, then we would need to save it to file, but to save it, we need somehow to marshal the
67
- data into a binary format:
68
-
69
-
70
- ```ruby
71
- lst = R.list(a: 1, b: 2, c: 3)
72
- lst.saveRDS("lst.rds")
73
- ```
74
- Then, in the next chunk, where variable 'lst' is used, we need to read back its value:
75
-
76
-
77
- ```ruby
78
- lst = R.readRDS("lst.rds")
79
- puts lst
80
- ```
81
-
82
- ```
83
- ## $a
84
- ## [1] 1
85
- ##
86
- ## $b
87
- ## [1] 2
88
- ##
89
- ## $c
90
- ## [1] 3
91
- ```
92
-
93
- Now, any single piece of code may have dozens of variables that we might want to use and reuse between chunks.
94
- Clearly, such an approach quickly becomes unmanageable. Probably because of
95
- this problem, it is very rare to see any __R markdown__ document in the Ruby community.
96
-
97
- When variables can be used across chunks, no overhead is needed:
98
-
99
-
100
- ```ruby
101
- lst = R.list(a: 1, b: 2, c: 3)
102
- # any other code can be added here
103
- ```
104
-
105
-
106
- ```ruby
107
- puts lst
108
- ```
109
-
110
- ```
111
- ## $a
112
- ## [1] 1
113
- ##
114
- ## $b
115
- ## [1] 2
116
- ##
117
- ## $c
118
- ## [1] 3
119
- ```
120
-
121
- In the Python community, the same effort to have code and text in an integrated environment
122
- started in the first decade of the 2000s. In 2006, iPython 0.7.2 was released. In 2014,
123
- Fernando Pérez spun off Project Jupyter from iPython, creating a web-based interactive
124
- computation environment. Jupyter can now be used with many languages, including Ruby with the
125
- iruby gem (https://github.com/SciRuby/iruby). In order to have multiple languages in a Jupyter
126
- notebook the SoS kernel was developed (https://vatlab.github.io/sos-docs/).
127
-
128
- # gKnitting a Document
129
-
130
- This document describes gKnit. gKnit is based on knitr and __R markdown__ and can knit a document
131
- written both in Ruby and/or R and output it in any of the available formats of __R markdown__. gKnit
132
- allows Ruby developers to do literate programming and reproducible research by letting them
133
- have text and code in a single document.
134
-
135
- gKnit runs atop GraalVM and Galaaz (an integration
136
- library between Ruby and R - see below). In gKnit, Ruby variables are persisted between
137
- chunks, making it an ideal solution for literate programming in this language. Also,
138
- since it is based on Galaaz, Ruby chunks can have access to R variables and Polyglot Programming
139
- with Ruby and R is quite natural.
140
-
141
- Galaaz has already been described in the following posts:
142
-
143
- * https://towardsdatascience.com/ruby-plotting-with-galaaz-an-example-of-tightly-coupling-ruby-and-r-in-graalvm-520b69e21021
144
- * https://medium.freecodecamp.org/how-to-make-beautiful-ruby-plots-with-galaaz-320848058857
145
-
146
- This is not a blog post on __R markdown__, and the interested user is directed to the following links
147
- for detailed information on its capabilities and use.
148
-
149
- * https://rmarkdown.rstudio.com/ or
150
- * https://bookdown.org/yihui/rmarkdown/
151
-
152
- In this post, we will describe just the main aspects of __R markdown__, so the user can start
153
- gKnitting Ruby and R documents quickly.
154
-
155
- ## The Yaml header
156
-
157
- An __R markdown__ document should start with a Yaml header and be stored in a file with
158
- '.Rmd' extension. This document has the following header for gKnitting an HTML document.
159
-
160
- ```
161
- ---
162
- title: "How to do reproducible research in Ruby with gKnit"
163
- author:
164
- - "Rodrigo Botafogo"
165
- - "Daniel Mossé - University of Pittsburgh"
166
- tags: [Tech, Data Science, Ruby, R, GraalVM]
167
- date: "20/02/2019"
168
- output:
169
- html_document:
170
- self_contained: true
171
- keep_md: true
172
- pdf_document:
173
- includes:
174
- in_header: ["../../sty/galaaz.sty"]
175
- number_sections: yes
176
- ---
177
- ```
178
-
179
- For more information on the options in the Yaml header, check https://bookdown.org/yihui/rmarkdown/html-document.html.
180
-
181
- ## __R Markdown__ formatting
182
-
183
- Document formatting can be done with simple markups such as:
184
-
185
- ### Headers
186
-
187
- ```
188
- # Header 1
189
-
190
- ## Header 2
191
-
192
- ### Header 3
193
-
194
- ```
195
-
196
- ### Lists
197
-
198
- ```
199
- Unordered lists:
200
-
201
- * Item 1
202
- * Item 2
203
- + Item 2a
204
- + Item 2b
205
- ```
206
-
207
- ```
208
- Ordered Lists
209
-
210
- 1. Item 1
211
- 2. Item 2
212
- 3. Item 3
213
- + Item 3a
214
- + Item 3b
215
- ```
216
-
217
- For more R markdown formatting go to https://rmarkdown.rstudio.com/authoring_basics.html.
218
-
219
- ### R chunks
220
-
221
- Running Ruby and R code is what really interests us in this blog.
222
- Inserting a code chunk is done by adding code in a block delimited by three back ticks
223
- followed by an open
224
- curly brace ('{') followed by the engine name (r, ruby, rb, include, ...), and
225
- any optional chunk_label and options, as shown below:
226
-
227
- ````
228
- ```{engine_name [chunk_label], [chunk_options]}
229
- ```
230
- ````
231
-
232
- For instance, let's add an R chunk to the document labeled 'first_r_chunk'. This is
233
- a very simple code just to create a variable and print it out, as follows:
234
-
235
- ````
236
- ```{r first_r_chunk}
237
- vec <- c(1, 2, 3)
238
- print(vec)
239
- ```
240
- ````
241
-
242
- If this block is added to an __R markdown__ document and gKnitted, the result will be:
243
-
244
-
245
- ```r
246
- vec <- c(1, 2, 3)
247
- print(vec)
248
- ```
249
-
250
- ```
251
- ## [1] 1 2 3
252
- ```
253
-
254
- Now let's say that we want to do some analysis in the code, but just print the result and not the
255
- code itself. For this, we need to add the option 'echo = FALSE'.
256
-
257
- ````
258
- ```{r second_r_chunk, echo = FALSE}
259
- vec2 <- c(10, 20, 30)
260
- vec3 <- vec * vec2
261
- print(vec3)
262
- ```
263
- ````
264
- Here is how this block will show up in the document. Observe that the code is not shown
265
- and we only see the execution result in a white box:
266
-
267
-
268
- ```
269
- ## [1] 10 40 90
270
- ```
271
-
272
- A description of the available chunk options can be found at https://yihui.name/knitr/.
273
-
274
- Let's add another R chunk with a function definition. In this example, a vector
275
- 'r_vec' is created and
276
- a new function 'reduce_sum' is defined. The chunk specification is
277
-
278
- ````
279
- ```{r data_creation}
280
- r_vec <- c(1, 2, 3, 4, 5)
281
-
282
- reduce_sum <- function(...) {
283
- Reduce(sum, as.list(...))
284
- }
285
- ```
286
- ````
287
-
288
- and this is what it will look like once executed. From now on, to be concise in the
289
- presentation we will not show chunk definitions any longer.
290
-
291
-
292
-
293
- ```r
294
- r_vec <- c(1, 2, 3, 4, 5)
295
-
296
- reduce_sum <- function(...) {
297
- Reduce(sum, as.list(...))
298
- }
299
- ```
300
-
301
- We can, possibly in another chunk, access the vector and call the function as follows:
302
-
303
-
304
- ```r
305
- print(r_vec)
306
- ```
307
-
308
- ```
309
- ## [1] 1 2 3 4 5
310
- ```
311
-
312
- ```r
313
- print(reduce_sum(r_vec))
314
- ```
315
-
316
- ```
317
- ## [1] 15
318
- ```
319
- ### R Graphics with ggplot
320
-
321
- In the following chunk, we create a bubble chart in R using ggplot and include it in
322
- this document. Note that there is no directive in the code to include the image, this
323
- occurs automatically. The 'mpg' dataframe is natively available to R and to Galaaz as
324
- well.
325
-
326
- For the reader not knowledgeable of ggplot, ggplot is a graphics library based on "the
327
- grammar of graphics" [@Wilkinson:grammar_of_graphics]. The idea of the grammar of graphics
328
- is to build a graphic by adding layers to the plot. More information can be found at
329
- https://towardsdatascience.com/a-comprehensive-guide-to-the-grammar-of-graphics-for-effective-visualization-of-multi-dimensional-1f92b4ed4149.
330
-
331
- In the plot below, the 'mpg' dataset from the ggplot2 package is used. "The data concerns city-cycle fuel
332
- consumption in miles per gallon, to be predicted in terms of 3 multivalued discrete and 5
333
- continuous attributes." (Quinlan, 1993)
334
-
335
- First, the 'mpg' dataset is filtered to extract only cars from the following manufacturers: Audi, Ford,
336
- Honda, and Hyundai, and stored in the 'mpg_select' variable. Then, the selected dataframe is passed
337
- to the ggplot function, specifying in the aesthetic method (aes) that 'displacement' (displ) should
338
- be plotted on the 'x' axis and 'city mileage' (cty) should be on the 'y' axis. In the 'labs' layer we
339
- pass the 'title' and 'subtitle' for the plot. To the basic plot 'g', geom\_jitter is added, which
340
- plots cars from the same manufacturer with the same color (col=manufacturer), with the size of each
341
- car point set by its highway mileage (size = hwy). Finally, a last layer is plotted containing
342
- a linear regression line (method = "lm") for every manufacturer.
343
-
344
-
345
- ```r
346
- # load package and data
347
- library(ggplot2)
348
- data(mpg, package="ggplot2")
349
-
350
- mpg_select <- mpg[mpg$manufacturer %in% c("audi", "ford", "honda", "hyundai"), ]
351
-
352
- # Scatterplot
353
- theme_set(theme_bw()) # pre-set the bw theme.
354
- g <- ggplot(mpg_select, aes(displ, cty)) +
355
- labs(subtitle="mpg: Displacement vs City Mileage",
356
- title="Bubble chart")
357
-
358
- g + geom_jitter(aes(col=manufacturer, size=hwy)) +
359
- geom_smooth(aes(col=manufacturer), method="lm", se=F)
360
- ```
361
-
362
- ![](/home/rbotafogo/desenv/galaaz/blogs/gknit/gknit_files/figure-html/bubble-1.png)<!-- -->
363
-
364
- ### Ruby chunks
365
-
366
-
367
- Including a Ruby chunk is just as easy as including an R chunk in the document: just
368
- change the name of the engine to 'ruby'. It is also possible to pass chunk options
369
- to the Ruby engine; however, this version does not accept all the options that are
370
- available to R chunks. Future versions will add those options.
371
-
372
- ````
373
- ```{ruby first_ruby_chunk}
374
- ```
375
- ````
376
-
377
- In this example, the ruby chunk is called 'first_ruby_chunk'. One important
378
- aspect of chunk labels is that they cannot be duplicated. If a chunk label is
379
- duplicated, gKnit will stop with an error.
380
-
381
- In the following chunk, variables 'a', 'b' and 'c' are standard Ruby variables
382
- and 'vec' and 'vec2' are two vectors created by calling the 'c' method on the
383
- R module.
384
-
385
- In Galaaz, the R module allows us to access R functions transparently. The 'c'
386
- function in R concatenates its arguments into a vector.
387
-
388
- It
389
- should be clear that there is no requirement in gKnit to call or use any R
390
- functions. gKnit will knit standard Ruby code, or even general text without
391
- any code.
392
-
393
-
394
- ```ruby
395
- a = [1, 2, 3]
396
- b = "US$ 250.000"
397
- c = "The 'outputs' function"
398
-
399
- vec = R.c(1, 2, 3)
400
- vec2 = R.c(10, 20, 30)
401
- ```
402
-
403
- In the next block, variables 'a', 'vec' and 'vec2' are used and printed.
404
-
405
-
406
- ```ruby
407
- puts a
408
- puts vec * vec2
409
- ```
410
-
411
- ```
412
- ## 1
413
- ## 2
414
- ## 3
415
- ## [1] 10 40 90
416
- ```
417
-
418
- Note that 'a' is a standard Ruby Array and 'vec' and 'vec2' are vectors that behave accordingly,
419
- where multiplication works as expected.
420
-
421
-
422
- ### Accessing R from Ruby
423
-
424
- One of the nice aspects of Galaaz on GraalVM is that variables and functions defined in R can
425
- be easily accessed from Ruby. This next chunk, reads data from R and uses the 'reduce_sum'
426
- function defined previously. To access an R variable from Ruby the '~' function should be
427
- applied to the Ruby symbol representing the R variable. Since the R variable is called 'r_vec',
428
- in Ruby, the symbol to access it is ':r_vec' and thus '~:r_vec' retrieves the value of the
429
- variable.
430
-
431
-
432
- ```ruby
433
- puts ~:r_vec
434
- ```
435
-
436
- ```
437
- ## [1] 1 2 3 4 5
438
- ```
439
-
440
- In order to call an R function, the 'R.' module is used as follows:
441
-
442
-
443
- ```ruby
444
- puts R.reduce_sum(~:r_vec)
445
- ```
446
-
447
- ```
448
- ## [1] 15
449
- ```
450
-
451
- ### Ruby Plotting
452
-
453
- We have seen an example of plotting with R. Plotting with Ruby does not require
454
- anything different from plotting with R. In the following example, we plot a
455
- diverging bar graph using the 'mtcars' dataframe from R. This data was extracted
456
- from the 1974 Motor Trend US magazine, and comprises fuel consumption and 10 aspects
457
- of automobile design and performance for 32 automobiles (1973–74 models). The
458
- variables are:
459
-
460
- * mpg: Miles/(US) gallon
461
- * cyl: Number of cylinders
462
- * disp: Displacement (cu.in.)
463
- * hp: Gross horsepower
464
- * drat: Rear axle ratio
465
- * wt: Weight (1000 lbs)
466
- * qsec: 1/4 mile time
467
- * vs: Engine (0 = V-shaped, 1 = straight)
468
- * am: Transmission (0 = automatic, 1 = manual)
469
- * gear: Number of forward gears
470
- * carb: Number of carburetors
471
-
472
-
473
-
474
- ```ruby
475
- # copy the R variable :mtcars to the Ruby mtcars variable
476
- mtcars = ~:mtcars
477
-
478
- # create a new column 'car_name' to store the car names so that it can be
479
- # used for plotting. The 'rownames' of the data frame cannot be used as
480
- # data for plotting
481
- mtcars.car_name = R.rownames(:mtcars)
482
-
483
- # compute normalized mpg and add it to a new column called mpg_z
484
- # Note that the mean value for mpg can be obtained by calling the 'mean'
485
- # function on the vector 'mtcars.mpg'. The same with the standard
486
- # deviation 'sd'. The vector is then rounded to two digits with 'round 2'
487
- mtcars.mpg_z = ((mtcars.mpg - mtcars.mpg.mean)/mtcars.mpg.sd).round 2
488
-
489
- # create a new column 'mpg_type'. Function 'ifelse' is a vectorized function
490
- # that looks at every element of the mpg_z vector and if the value is below
491
- # 0, returns 'below', otherwise returns 'above'
492
- mtcars.mpg_type = (mtcars.mpg_z < 0).ifelse("below", "above")
493
-
494
- # order the mtcars data set by the mpg_z vector from smaller to larger values
495
- mtcars = mtcars[mtcars.mpg_z.order, :all]
496
-
497
- # convert the car_name column to a factor to retain sorted order in plot
498
- mtcars.car_name = mtcars.car_name.factor levels: mtcars.car_name
499
-
500
- # let's look at the first records of the final data frame
501
- puts mtcars.head
502
- ```
503
-
504
- ```
505
- ## mpg cyl disp hp drat wt qsec vs am gear carb
506
- ## Cadillac Fleetwood 10.4 8 472 205 2.93 5.250 17.98 0 0 3 4
507
- ## Lincoln Continental 10.4 8 460 215 3.00 5.424 17.82 0 0 3 4
508
- ## Camaro Z28 13.3 8 350 245 3.73 3.840 15.41 0 0 3 4
509
- ## Duster 360 14.3 8 360 245 3.21 3.570 15.84 0 0 3 4
510
- ## Chrysler Imperial 14.7 8 440 230 3.23 5.345 17.42 0 0 3 4
511
- ## Maserati Bora 15.0 8 301 335 3.54 3.570 14.60 0 1 5 8
512
- ## car_name mpg_z mpg_type
513
- ## Cadillac Fleetwood Cadillac Fleetwood -1.61 below
514
- ## Lincoln Continental Lincoln Continental -1.61 below
515
- ## Camaro Z28 Camaro Z28 -1.13 below
516
- ## Duster 360 Duster 360 -0.96 below
517
- ## Chrysler Imperial Chrysler Imperial -0.89 below
518
- ## Maserati Bora Maserati Bora -0.84 below
519
- ```
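
The normalization used for 'mpg_z' can be reproduced in plain Ruby to see what each step computes. The following is a minimal sketch, not Galaaz code: the helper names `sample_sd` and `z_scores` are hypothetical, the input values are made up for illustration, and the standard deviation divides by n - 1, as R's 'sd' does.

```ruby
# Plain-Ruby sketch of the normalization above (no Galaaz or R required).
# Helper names and input values are illustrative assumptions.

# Sample standard deviation (divides by n - 1, like R's 'sd')
def sample_sd(values)
  mean = values.sum / values.size.to_f
  Math.sqrt(values.sum { |v| (v - mean)**2 } / (values.size - 1))
end

# Normalize each value to (value - mean) / sd, rounded to two digits
def z_scores(values)
  mean = values.sum / values.size.to_f
  sd = sample_sd(values)
  values.map { |v| ((v - mean) / sd).round(2) }
end

mpg = [30.0, 10.0, 20.0]
mpg_z = z_scores(mpg)

# Equivalent of the vectorized 'ifelse': label each element
mpg_type = mpg_z.map { |z| z < 0 ? 'below' : 'above' }

# Equivalent of 'order': indices that sort mpg_z from smaller to larger
order = mpg_z.each_index.sort_by { |i| mpg_z[i] }
sorted_mpg = order.map { |i| mpg[i] }

puts mpg_z.inspect
puts mpg_type.inspect
puts sorted_mpg.inspect
```

In the Galaaz version these steps run on R vectors, so no explicit iteration is needed; the sketch only spells out the arithmetic.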
520
-
521
-
522
- ```ruby
523
- require 'ggplot'
524
-
525
- puts mtcars.ggplot(E.aes(x: :car_name, y: :mpg_z, label: :mpg_z)) +
526
- R.geom_bar(E.aes(fill: :mpg_type), stat: 'identity', width: 0.5) +
527
- R.scale_fill_manual(name: 'Mileage',
528
- labels: R.c('Above Average', 'Below Average'),
529
- values: R.c('above': '#00ba38', 'below': '#f8766d')) +
530
- R.labs(subtitle: "Normalised mileage from 'mtcars'",
531
- title: "Diverging Bars") +
532
- R.coord_flip
533
- ```
534
-
535
-
536
- ![](/home/rbotafogo/desenv/galaaz/blogs/gknit/gknit_files/figure-html/diverging_bar.png)<!-- -->
537
-
538
- ### Inline Ruby code
539
-
540
- When using a Ruby chunk, the code and the output are formatted in blocks as seen above.
541
- This formatting is not always desired. Sometimes, we want to have the results of the
542
- Ruby evaluation included in the middle of a phrase. gKnit allows adding inline Ruby code
543
- with the 'rb' engine. The following chunk specification will
544
- create an inline Ruby text:
545
-
546
- ````
547
- This is some text with inline Ruby accessing variable 'b' which has value:
548
- ```{rb puts b}
549
- ```
550
- and is followed by some other text!
551
- ````
552
-
553
- <div style="margin-bottom:30px;">
554
- </div>
555
-
556
- This is some text with inline Ruby accessing variable 'b' which has value:
557
- US$ 250.000
558
- and is followed by some other text!
559
-
560
- <div style="margin-bottom:30px;">
561
- </div>
562
-
563
- Note that it is important not to add any new line before or after the code
564
- block if we want everything to be in only one line, resulting in the following sentence
565
- with inline Ruby code.
566
-
567
-
568
- ### The 'outputs' function
569
-
570
- We have previously used the standard 'puts' method in Ruby chunks in order to produce
571
- output. The result of a 'puts', as seen in all previous chunks that use it, is formatted
572
- inside a white box that
573
- follows the code block. Many times however, we would like to do some processing in the
574
- Ruby chunk and have the result of this processing generate an output that is
575
- "included" in the document as if we had typed it in the __R markdown__ document.
576
-
577
- For example, suppose we want to create a new heading in our document, but the heading
578
- phrase is the result of some code processing: maybe it's the first line of a file we are
579
- going to read. Method 'outputs' adds its output as if typed in the __R markdown__ document.
580
-
581
- Now take a look at variable 'c' (it was defined in a previous block above) as
581
- 'c = "The 'outputs' function"'. "The 'outputs' function" is actually the name of this
583
- section and it was created using the 'outputs' function inside a Ruby chunk.
584
-
585
- The ruby chunk to generate this heading is:
586
-
587
- ````
588
- ```{ruby heading}
589
- outputs "### #{c}"
590
- ```
591
- ````
592
-
593
- The three '#' characters are how we add a Heading 3 in __R markdown__.
594
-
595
-
596
- ### HTML Output from Ruby Chunks
597
-
598
- We've just seen the use of method 'outputs' to add text to the __R markdown__
599
- document. This technique can also be used to add HTML code to the document. In
600
- __R markdown__, any HTML code typed directly in the document will be properly rendered.
601
- Here, for instance, is a table definition in HTML and its output in the document:
602
-
603
- ```
604
- <table style="width:100%">
605
- <tr>
606
- <th>Firstname</th>
607
- <th>Lastname</th>
608
- <th>Age</th>
609
- </tr>
610
- <tr>
611
- <td>Jill</td>
612
- <td>Smith</td>
613
- <td>50</td>
614
- </tr>
615
- <tr>
616
- <td>Eve</td>
617
- <td>Jackson</td>
618
- <td>94</td>
619
- </tr>
620
- </table>
621
- ```
622
- <div style="margin-bottom:30px;">
623
- </div>
624
-
625
- <table style="width:100%">
626
- <tr>
627
- <th>Firstname</th>
628
- <th>Lastname</th>
629
- <th>Age</th>
630
- </tr>
631
- <tr>
632
- <td>Jill</td>
633
- <td>Smith</td>
634
- <td>50</td>
635
- </tr>
636
- <tr>
637
- <td>Eve</td>
638
- <td>Jackson</td>
639
- <td>94</td>
640
- </tr>
641
- </table>
642
-
643
- <div style="margin-bottom:30px;">
644
- </div>
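
The same table could also be generated from data inside a Ruby chunk rather than typed by hand. The following is a minimal plain-Ruby sketch: `html_table` is a hypothetical helper, not part of gKnit, and in a gKnit chunk the resulting string would presumably be passed to 'outputs' instead of 'puts'.

```ruby
require 'cgi'

# Hypothetical helper (not part of gKnit): builds an HTML table string
# from a header row and an array of data rows.
def html_table(header, rows)
  head = header.map { |h| "    <th>#{CGI.escapeHTML(h.to_s)}</th>" }.join("\n")
  body = rows.map do |row|
    cells = row.map { |c| "    <td>#{CGI.escapeHTML(c.to_s)}</td>" }.join("\n")
    "  <tr>\n#{cells}\n  </tr>"
  end.join("\n")
  "<table style=\"width:100%\">\n  <tr>\n#{head}\n  </tr>\n#{body}\n</table>"
end

table = html_table(%w[Firstname Lastname Age],
                   [['Jill', 'Smith', 50], ['Eve', 'Jackson', 94]])
puts table
```

Cell values are escaped with `CGI.escapeHTML` so that data containing '<' or '&' does not break the markup.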
645
-
646
- But manually creating HTML output is not always easy or desirable, especially
647
- if we intend the document to be rendered in other formats, for example, as LaTeX.
648
- Also, the above
649
- table looks ugly. The 'kableExtra' library is a great library for
650
- creating beautiful tables. Take a look at https://cran.r-project.org/web/packages/kableExtra/vignettes/awesome_table_in_html.html
651
-
652
- In the next chunk, we output the 'mtcars' dataframe from R in a nicely formatted
653
- table. Note that we retrieve the mtcars dataframe by using '~:mtcars'.
654
-
655
-
656
- ```ruby
657
- R.install_and_loads('kableExtra')
658
- outputs (~:mtcars).kable.kable_styling
659
- ```
660
-
661
- <table class="table" style="margin-left: auto; margin-right: auto;">
662
- <thead>
663
- <tr>
664
- <th style="text-align:left;"> </th>
665
- <th style="text-align:right;"> mpg </th>
666
- <th style="text-align:right;"> cyl </th>
667
- <th style="text-align:right;"> disp </th>
668
- <th style="text-align:right;"> hp </th>
669
- <th style="text-align:right;"> drat </th>
670
- <th style="text-align:right;"> wt </th>
671
- <th style="text-align:right;"> qsec </th>
672
- <th style="text-align:right;"> vs </th>
673
- <th style="text-align:right;"> am </th>
674
- <th style="text-align:right;"> gear </th>
675
- <th style="text-align:right;"> carb </th>
676
- </tr>
677
- </thead>
678
- <tbody>
679
- <tr>
680
- <td style="text-align:left;"> Mazda RX4 </td>
681
- <td style="text-align:right;"> 21.0 </td>
682
- <td style="text-align:right;"> 6 </td>
683
- <td style="text-align:right;"> 160.0 </td>
684
- <td style="text-align:right;"> 110 </td>
685
- <td style="text-align:right;"> 3.90 </td>
686
- <td style="text-align:right;"> 2.620 </td>
687
- <td style="text-align:right;"> 16.46 </td>
688
- <td style="text-align:right;"> 0 </td>
689
- <td style="text-align:right;"> 1 </td>
690
- <td style="text-align:right;"> 4 </td>
691
- <td style="text-align:right;"> 4 </td>
692
- </tr>
693
- <tr>
694
- <td style="text-align:left;"> Mazda RX4 Wag </td>
695
- <td style="text-align:right;"> 21.0 </td>
696
- <td style="text-align:right;"> 6 </td>
697
- <td style="text-align:right;"> 160.0 </td>
698
- <td style="text-align:right;"> 110 </td>
699
- <td style="text-align:right;"> 3.90 </td>
700
- <td style="text-align:right;"> 2.875 </td>
701
- <td style="text-align:right;"> 17.02 </td>
702
- <td style="text-align:right;"> 0 </td>
703
- <td style="text-align:right;"> 1 </td>
704
- <td style="text-align:right;"> 4 </td>
705
- <td style="text-align:right;"> 4 </td>
706
- </tr>
707
- <tr>
708
- <td style="text-align:left;"> Datsun 710 </td>
709
- <td style="text-align:right;"> 22.8 </td>
710
- <td style="text-align:right;"> 4 </td>
711
- <td style="text-align:right;"> 108.0 </td>
712
- <td style="text-align:right;"> 93 </td>
713
- <td style="text-align:right;"> 3.85 </td>
714
- <td style="text-align:right;"> 2.320 </td>
715
- <td style="text-align:right;"> 18.61 </td>
716
- <td style="text-align:right;"> 1 </td>
717
- <td style="text-align:right;"> 1 </td>
718
- <td style="text-align:right;"> 4 </td>
719
- <td style="text-align:right;"> 1 </td>
720
- </tr>
721
- <tr>
722
- <td style="text-align:left;"> Hornet 4 Drive </td>
723
- <td style="text-align:right;"> 21.4 </td>
724
- <td style="text-align:right;"> 6 </td>
725
- <td style="text-align:right;"> 258.0 </td>
726
- <td style="text-align:right;"> 110 </td>
727
- <td style="text-align:right;"> 3.08 </td>
728
- <td style="text-align:right;"> 3.215 </td>
729
- <td style="text-align:right;"> 19.44 </td>
730
- <td style="text-align:right;"> 1 </td>
731
- <td style="text-align:right;"> 0 </td>
732
- <td style="text-align:right;"> 3 </td>
733
- <td style="text-align:right;"> 1 </td>
734
- </tr>
735
- <tr>
736
- <td style="text-align:left;"> Hornet Sportabout </td>
737
- <td style="text-align:right;"> 18.7 </td>
738
- <td style="text-align:right;"> 8 </td>
739
- <td style="text-align:right;"> 360.0 </td>
740
- <td style="text-align:right;"> 175 </td>
741
- <td style="text-align:right;"> 3.15 </td>
742
- <td style="text-align:right;"> 3.440 </td>
743
- <td style="text-align:right;"> 17.02 </td>
744
- <td style="text-align:right;"> 0 </td>
745
- <td style="text-align:right;"> 0 </td>
746
- <td style="text-align:right;"> 3 </td>
747
- <td style="text-align:right;"> 2 </td>
748
- </tr>
749
- <tr>
750
- <td style="text-align:left;"> Valiant </td>
751
- <td style="text-align:right;"> 18.1 </td>
752
- <td style="text-align:right;"> 6 </td>
753
- <td style="text-align:right;"> 225.0 </td>
754
- <td style="text-align:right;"> 105 </td>
755
- <td style="text-align:right;"> 2.76 </td>
756
- <td style="text-align:right;"> 3.460 </td>
757
- <td style="text-align:right;"> 20.22 </td>
758
- <td style="text-align:right;"> 1 </td>
759
- <td style="text-align:right;"> 0 </td>
760
- <td style="text-align:right;"> 3 </td>
761
- <td style="text-align:right;"> 1 </td>
762
- </tr>
763
- <tr>
764
- <td style="text-align:left;"> Duster 360 </td>
765
- <td style="text-align:right;"> 14.3 </td>
766
- <td style="text-align:right;"> 8 </td>
767
- <td style="text-align:right;"> 360.0 </td>
768
- <td style="text-align:right;"> 245 </td>
769
- <td style="text-align:right;"> 3.21 </td>
770
- <td style="text-align:right;"> 3.570 </td>
771
- <td style="text-align:right;"> 15.84 </td>
772
- <td style="text-align:right;"> 0 </td>
773
- <td style="text-align:right;"> 0 </td>
774
- <td style="text-align:right;"> 3 </td>
775
- <td style="text-align:right;"> 4 </td>
776
- </tr>
777
- <tr>
778
- <td style="text-align:left;"> Merc 240D </td>
779
- <td style="text-align:right;"> 24.4 </td>
780
- <td style="text-align:right;"> 4 </td>
781
- <td style="text-align:right;"> 146.7 </td>
782
- <td style="text-align:right;"> 62 </td>
783
- <td style="text-align:right;"> 3.69 </td>
784
- <td style="text-align:right;"> 3.190 </td>
785
- <td style="text-align:right;"> 20.00 </td>
786
- <td style="text-align:right;"> 1 </td>
787
- <td style="text-align:right;"> 0 </td>
788
- <td style="text-align:right;"> 4 </td>
789
- <td style="text-align:right;"> 2 </td>
790
- </tr>
791
- <tr>
792
- <td style="text-align:left;"> Merc 230 </td>
793
- <td style="text-align:right;"> 22.8 </td>
794
- <td style="text-align:right;"> 4 </td>
795
- <td style="text-align:right;"> 140.8 </td>
796
- <td style="text-align:right;"> 95 </td>
797
- <td style="text-align:right;"> 3.92 </td>
798
- <td style="text-align:right;"> 3.150 </td>
799
- <td style="text-align:right;"> 22.90 </td>
800
- <td style="text-align:right;"> 1 </td>
801
- <td style="text-align:right;"> 0 </td>
802
- <td style="text-align:right;"> 4 </td>
803
- <td style="text-align:right;"> 2 </td>
804
- </tr>
805
- <tr>
806
- <td style="text-align:left;"> Merc 280 </td>
807
- <td style="text-align:right;"> 19.2 </td>
808
- <td style="text-align:right;"> 6 </td>
809
- <td style="text-align:right;"> 167.6 </td>
810
- <td style="text-align:right;"> 123 </td>
811
- <td style="text-align:right;"> 3.92 </td>
812
- <td style="text-align:right;"> 3.440 </td>
813
- <td style="text-align:right;"> 18.30 </td>
814
- <td style="text-align:right;"> 1 </td>
815
- <td style="text-align:right;"> 0 </td>
816
- <td style="text-align:right;"> 4 </td>
817
- <td style="text-align:right;"> 4 </td>
818
- </tr>
819
- <tr>
820
- <td style="text-align:left;"> Merc 280C </td>
821
- <td style="text-align:right;"> 17.8 </td>
822
- <td style="text-align:right;"> 6 </td>
823
- <td style="text-align:right;"> 167.6 </td>
824
- <td style="text-align:right;"> 123 </td>
825
- <td style="text-align:right;"> 3.92 </td>
826
- <td style="text-align:right;"> 3.440 </td>
827
- <td style="text-align:right;"> 18.90 </td>
828
- <td style="text-align:right;"> 1 </td>
829
- <td style="text-align:right;"> 0 </td>
830
- <td style="text-align:right;"> 4 </td>
831
- <td style="text-align:right;"> 4 </td>
832
- </tr>
833
- <tr>
834
- <td style="text-align:left;"> Merc 450SE </td>
835
- <td style="text-align:right;"> 16.4 </td>
836
- <td style="text-align:right;"> 8 </td>
837
- <td style="text-align:right;"> 275.8 </td>
838
- <td style="text-align:right;"> 180 </td>
839
- <td style="text-align:right;"> 3.07 </td>
840
- <td style="text-align:right;"> 4.070 </td>
841
- <td style="text-align:right;"> 17.40 </td>
842
- <td style="text-align:right;"> 0 </td>
843
- <td style="text-align:right;"> 0 </td>
844
- <td style="text-align:right;"> 3 </td>
845
- <td style="text-align:right;"> 3 </td>
846
- </tr>
847
- <tr>
848
- <td style="text-align:left;"> Merc 450SL </td>
849
- <td style="text-align:right;"> 17.3 </td>
850
- <td style="text-align:right;"> 8 </td>
851
- <td style="text-align:right;"> 275.8 </td>
852
- <td style="text-align:right;"> 180 </td>
853
- <td style="text-align:right;"> 3.07 </td>
854
- <td style="text-align:right;"> 3.730 </td>
855
- <td style="text-align:right;"> 17.60 </td>
856
- <td style="text-align:right;"> 0 </td>
857
- <td style="text-align:right;"> 0 </td>
858
- <td style="text-align:right;"> 3 </td>
859
- <td style="text-align:right;"> 3 </td>
860
- </tr>
861
- <tr>
862
- <td style="text-align:left;"> Merc 450SLC </td>
863
- <td style="text-align:right;"> 15.2 </td>
864
- <td style="text-align:right;"> 8 </td>
865
- <td style="text-align:right;"> 275.8 </td>
866
- <td style="text-align:right;"> 180 </td>
867
- <td style="text-align:right;"> 3.07 </td>
868
- <td style="text-align:right;"> 3.780 </td>
869
- <td style="text-align:right;"> 18.00 </td>
870
- <td style="text-align:right;"> 0 </td>
871
- <td style="text-align:right;"> 0 </td>
872
- <td style="text-align:right;"> 3 </td>
873
- <td style="text-align:right;"> 3 </td>
874
- </tr>
875
- <tr>
876
- <td style="text-align:left;"> Cadillac Fleetwood </td>
877
- <td style="text-align:right;"> 10.4 </td>
878
- <td style="text-align:right;"> 8 </td>
879
- <td style="text-align:right;"> 472.0 </td>
880
- <td style="text-align:right;"> 205 </td>
881
- <td style="text-align:right;"> 2.93 </td>
882
- <td style="text-align:right;"> 5.250 </td>
883
- <td style="text-align:right;"> 17.98 </td>
884
- <td style="text-align:right;"> 0 </td>
885
- <td style="text-align:right;"> 0 </td>
886
- <td style="text-align:right;"> 3 </td>
887
- <td style="text-align:right;"> 4 </td>
888
- </tr>
889
- <tr>
890
- <td style="text-align:left;"> Lincoln Continental </td>
891
- <td style="text-align:right;"> 10.4 </td>
892
- <td style="text-align:right;"> 8 </td>
893
- <td style="text-align:right;"> 460.0 </td>
894
- <td style="text-align:right;"> 215 </td>
895
- <td style="text-align:right;"> 3.00 </td>
896
- <td style="text-align:right;"> 5.424 </td>
897
- <td style="text-align:right;"> 17.82 </td>
898
- <td style="text-align:right;"> 0 </td>
899
- <td style="text-align:right;"> 0 </td>
900
- <td style="text-align:right;"> 3 </td>
901
- <td style="text-align:right;"> 4 </td>
902
- </tr>
903
- <tr>
904
- <td style="text-align:left;"> Chrysler Imperial </td>
905
- <td style="text-align:right;"> 14.7 </td>
906
- <td style="text-align:right;"> 8 </td>
907
- <td style="text-align:right;"> 440.0 </td>
908
- <td style="text-align:right;"> 230 </td>
909
- <td style="text-align:right;"> 3.23 </td>
910
- <td style="text-align:right;"> 5.345 </td>
911
- <td style="text-align:right;"> 17.42 </td>
912
- <td style="text-align:right;"> 0 </td>
913
- <td style="text-align:right;"> 0 </td>
914
- <td style="text-align:right;"> 3 </td>
915
- <td style="text-align:right;"> 4 </td>
916
- </tr>
917
- <tr>
918
- <td style="text-align:left;"> Fiat 128 </td>
919
- <td style="text-align:right;"> 32.4 </td>
920
- <td style="text-align:right;"> 4 </td>
921
- <td style="text-align:right;"> 78.7 </td>
922
- <td style="text-align:right;"> 66 </td>
923
- <td style="text-align:right;"> 4.08 </td>
924
- <td style="text-align:right;"> 2.200 </td>
925
- <td style="text-align:right;"> 19.47 </td>
926
- <td style="text-align:right;"> 1 </td>
927
- <td style="text-align:right;"> 1 </td>
928
- <td style="text-align:right;"> 4 </td>
929
- <td style="text-align:right;"> 1 </td>
930
- </tr>
931
- <tr>
932
- <td style="text-align:left;"> Honda Civic </td>
933
- <td style="text-align:right;"> 30.4 </td>
934
- <td style="text-align:right;"> 4 </td>
935
- <td style="text-align:right;"> 75.7 </td>
936
- <td style="text-align:right;"> 52 </td>
937
- <td style="text-align:right;"> 4.93 </td>
938
- <td style="text-align:right;"> 1.615 </td>
939
- <td style="text-align:right;"> 18.52 </td>
940
- <td style="text-align:right;"> 1 </td>
941
- <td style="text-align:right;"> 1 </td>
942
- <td style="text-align:right;"> 4 </td>
943
- <td style="text-align:right;"> 2 </td>
944
- </tr>
945
- <tr>
946
- <td style="text-align:left;"> Toyota Corolla </td>
947
- <td style="text-align:right;"> 33.9 </td>
948
- <td style="text-align:right;"> 4 </td>
949
- <td style="text-align:right;"> 71.1 </td>
950
- <td style="text-align:right;"> 65 </td>
951
- <td style="text-align:right;"> 4.22 </td>
952
- <td style="text-align:right;"> 1.835 </td>
953
- <td style="text-align:right;"> 19.90 </td>
954
- <td style="text-align:right;"> 1 </td>
955
- <td style="text-align:right;"> 1 </td>
956
- <td style="text-align:right;"> 4 </td>
957
- <td style="text-align:right;"> 1 </td>
958
- </tr>
959
- <tr>
960
- <td style="text-align:left;"> Toyota Corona </td>
961
- <td style="text-align:right;"> 21.5 </td>
962
- <td style="text-align:right;"> 4 </td>
963
- <td style="text-align:right;"> 120.1 </td>
964
- <td style="text-align:right;"> 97 </td>
965
- <td style="text-align:right;"> 3.70 </td>
966
- <td style="text-align:right;"> 2.465 </td>
967
- <td style="text-align:right;"> 20.01 </td>
968
- <td style="text-align:right;"> 1 </td>
969
- <td style="text-align:right;"> 0 </td>
970
- <td style="text-align:right;"> 3 </td>
971
- <td style="text-align:right;"> 1 </td>
972
- </tr>
973
- <tr>
974
- <td style="text-align:left;"> Dodge Challenger </td>
975
- <td style="text-align:right;"> 15.5 </td>
976
- <td style="text-align:right;"> 8 </td>
977
- <td style="text-align:right;"> 318.0 </td>
978
- <td style="text-align:right;"> 150 </td>
979
- <td style="text-align:right;"> 2.76 </td>
980
- <td style="text-align:right;"> 3.520 </td>
981
- <td style="text-align:right;"> 16.87 </td>
982
- <td style="text-align:right;"> 0 </td>
983
- <td style="text-align:right;"> 0 </td>
984
- <td style="text-align:right;"> 3 </td>
985
- <td style="text-align:right;"> 2 </td>
986
- </tr>
987
- <tr>
988
- <td style="text-align:left;"> AMC Javelin </td>
989
- <td style="text-align:right;"> 15.2 </td>
990
- <td style="text-align:right;"> 8 </td>
991
- <td style="text-align:right;"> 304.0 </td>
992
- <td style="text-align:right;"> 150 </td>
993
- <td style="text-align:right;"> 3.15 </td>
994
- <td style="text-align:right;"> 3.435 </td>
995
- <td style="text-align:right;"> 17.30 </td>
996
- <td style="text-align:right;"> 0 </td>
997
- <td style="text-align:right;"> 0 </td>
998
- <td style="text-align:right;"> 3 </td>
999
- <td style="text-align:right;"> 2 </td>
1000
- </tr>
1001
- <tr>
1002
- <td style="text-align:left;"> Camaro Z28 </td>
1003
- <td style="text-align:right;"> 13.3 </td>
1004
- <td style="text-align:right;"> 8 </td>
1005
- <td style="text-align:right;"> 350.0 </td>
1006
- <td style="text-align:right;"> 245 </td>
1007
- <td style="text-align:right;"> 3.73 </td>
1008
- <td style="text-align:right;"> 3.840 </td>
1009
- <td style="text-align:right;"> 15.41 </td>
1010
- <td style="text-align:right;"> 0 </td>
1011
- <td style="text-align:right;"> 0 </td>
1012
- <td style="text-align:right;"> 3 </td>
1013
- <td style="text-align:right;"> 4 </td>
1014
- </tr>
1015
- <tr>
1016
- <td style="text-align:left;"> Pontiac Firebird </td>
1017
- <td style="text-align:right;"> 19.2 </td>
1018
- <td style="text-align:right;"> 8 </td>
1019
- <td style="text-align:right;"> 400.0 </td>
1020
- <td style="text-align:right;"> 175 </td>
1021
- <td style="text-align:right;"> 3.08 </td>
1022
- <td style="text-align:right;"> 3.845 </td>
1023
- <td style="text-align:right;"> 17.05 </td>
1024
- <td style="text-align:right;"> 0 </td>
1025
- <td style="text-align:right;"> 0 </td>
1026
- <td style="text-align:right;"> 3 </td>
1027
- <td style="text-align:right;"> 2 </td>
1028
- </tr>
1029
- <tr>
1030
- <td style="text-align:left;"> Fiat X1-9 </td>
1031
- <td style="text-align:right;"> 27.3 </td>
1032
- <td style="text-align:right;"> 4 </td>
1033
- <td style="text-align:right;"> 79.0 </td>
1034
- <td style="text-align:right;"> 66 </td>
1035
- <td style="text-align:right;"> 4.08 </td>
1036
- <td style="text-align:right;"> 1.935 </td>
1037
- <td style="text-align:right;"> 18.90 </td>
1038
- <td style="text-align:right;"> 1 </td>
1039
- <td style="text-align:right;"> 1 </td>
1040
- <td style="text-align:right;"> 4 </td>
1041
- <td style="text-align:right;"> 1 </td>
1042
- </tr>
1043
- <tr>
1044
- <td style="text-align:left;"> Porsche 914-2 </td>
1045
- <td style="text-align:right;"> 26.0 </td>
1046
- <td style="text-align:right;"> 4 </td>
1047
- <td style="text-align:right;"> 120.3 </td>
1048
- <td style="text-align:right;"> 91 </td>
1049
- <td style="text-align:right;"> 4.43 </td>
1050
- <td style="text-align:right;"> 2.140 </td>
1051
- <td style="text-align:right;"> 16.70 </td>
1052
- <td style="text-align:right;"> 0 </td>
1053
- <td style="text-align:right;"> 1 </td>
1054
- <td style="text-align:right;"> 5 </td>
1055
- <td style="text-align:right;"> 2 </td>
1056
- </tr>
1057
- <tr>
1058
- <td style="text-align:left;"> Lotus Europa </td>
1059
- <td style="text-align:right;"> 30.4 </td>
1060
- <td style="text-align:right;"> 4 </td>
1061
- <td style="text-align:right;"> 95.1 </td>
1062
- <td style="text-align:right;"> 113 </td>
1063
- <td style="text-align:right;"> 3.77 </td>
1064
- <td style="text-align:right;"> 1.513 </td>
1065
- <td style="text-align:right;"> 16.90 </td>
1066
- <td style="text-align:right;"> 1 </td>
1067
- <td style="text-align:right;"> 1 </td>
1068
- <td style="text-align:right;"> 5 </td>
1069
- <td style="text-align:right;"> 2 </td>
1070
- </tr>
1071
- <tr>
1072
- <td style="text-align:left;"> Ford Pantera L </td>
1073
- <td style="text-align:right;"> 15.8 </td>
1074
- <td style="text-align:right;"> 8 </td>
1075
- <td style="text-align:right;"> 351.0 </td>
1076
- <td style="text-align:right;"> 264 </td>
1077
- <td style="text-align:right;"> 4.22 </td>
1078
- <td style="text-align:right;"> 3.170 </td>
1079
- <td style="text-align:right;"> 14.50 </td>
1080
- <td style="text-align:right;"> 0 </td>
1081
- <td style="text-align:right;"> 1 </td>
1082
- <td style="text-align:right;"> 5 </td>
1083
- <td style="text-align:right;"> 4 </td>
1084
- </tr>
1085
- <tr>
1086
- <td style="text-align:left;"> Ferrari Dino </td>
1087
- <td style="text-align:right;"> 19.7 </td>
1088
- <td style="text-align:right;"> 6 </td>
1089
- <td style="text-align:right;"> 145.0 </td>
1090
- <td style="text-align:right;"> 175 </td>
1091
- <td style="text-align:right;"> 3.62 </td>
1092
- <td style="text-align:right;"> 2.770 </td>
1093
- <td style="text-align:right;"> 15.50 </td>
1094
- <td style="text-align:right;"> 0 </td>
1095
- <td style="text-align:right;"> 1 </td>
1096
- <td style="text-align:right;"> 5 </td>
1097
- <td style="text-align:right;"> 6 </td>
1098
- </tr>
1099
- <tr>
1100
- <td style="text-align:left;"> Maserati Bora </td>
1101
- <td style="text-align:right;"> 15.0 </td>
1102
- <td style="text-align:right;"> 8 </td>
1103
- <td style="text-align:right;"> 301.0 </td>
1104
- <td style="text-align:right;"> 335 </td>
1105
- <td style="text-align:right;"> 3.54 </td>
1106
- <td style="text-align:right;"> 3.570 </td>
1107
- <td style="text-align:right;"> 14.60 </td>
1108
- <td style="text-align:right;"> 0 </td>
1109
- <td style="text-align:right;"> 1 </td>
1110
- <td style="text-align:right;"> 5 </td>
1111
- <td style="text-align:right;"> 8 </td>
1112
- </tr>
1113
- <tr>
1114
- <td style="text-align:left;"> Volvo 142E </td>
1115
- <td style="text-align:right;"> 21.4 </td>
1116
- <td style="text-align:right;"> 4 </td>
1117
- <td style="text-align:right;"> 121.0 </td>
1118
- <td style="text-align:right;"> 109 </td>
1119
- <td style="text-align:right;"> 4.11 </td>
1120
- <td style="text-align:right;"> 2.780 </td>
1121
- <td style="text-align:right;"> 18.60 </td>
1122
- <td style="text-align:right;"> 1 </td>
1123
- <td style="text-align:right;"> 1 </td>
1124
- <td style="text-align:right;"> 4 </td>
1125
- <td style="text-align:right;"> 2 </td>
1126
- </tr>
1127
- </tbody>
1128
- </table>
1129
-
1130
- ### Including Ruby files in a chunk
1131
-
1132
- R is a language that was created to be easy and fast for statisticians to use. As far
1133
- as I know, it was not a
1134
- language to be used for developing large systems. Of course, there are large systems and
1135
- libraries in R, but the focus of the language is on developing statistical models and
1136
- distributing them to peers.
1137
-
1138
- Ruby, on the other hand, is a language for large software development. Systems written in
1139
- Ruby will have dozens, hundreds or even thousands of files. To document a
1140
- large system with literate programming, we cannot expect the developer to add all the
1141
- files in a single '.Rmd' file. gKnit provides the 'include' chunk engine to include
1142
- a Ruby file as if it had been typed in the '.Rmd' file.
1143
-
1144
- To include a file, the following chunk should be created, where <filename> is the name of
1145
- the file to be included and where the extension, if it is '.rb', does not need to be added.
1146
- If the 'relative' option is not included, then it is treated as TRUE. When 'relative' is
1147
- true, Ruby's 'require\_relative' semantics is used to load the file; when false, Ruby's
1148
- \$LOAD_PATH is searched to find the file and it is 'require'd.
1149
-
1150
- ````
1151
- ```{include <filename>, relative = <TRUE/FALSE>}
1152
- ```
1153
- ````
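
The difference between the two settings of 'relative' can be illustrated in plain Ruby. This is only a sketch of the semantics described above, not gKnit's implementation: `include_file` and its `base_dir` parameter are hypothetical, and the relative case is approximated by resolving the name against a given directory before calling 'require'.

```ruby
require 'tmpdir'

# Hypothetical helper (NOT gKnit's implementation): loads a Ruby file
# following the 'relative' option as described in the text.
def include_file(name, relative: true, base_dir: Dir.pwd)
  if relative
    # Approximates require_relative: resolve the name against the
    # directory of the document; 'require' appends '.rb' when missing
    require File.expand_path(name, base_dir)
  else
    # Plain require: Ruby searches $LOAD_PATH for the file
    require name
  end
end

# relative = TRUE: the file sits next to the (pretend) document
Dir.mktmpdir do |dir|
  File.write(File.join(dir, 'gknit_sketch_model.rb'),
             "GKNIT_SKETCH_MODEL_LOADED = true\n")
  include_file('gknit_sketch_model', base_dir: dir)
end

# relative = FALSE: 'find' is located through $LOAD_PATH
include_file('find', relative: false)

puts defined?(GKNIT_SKETCH_MODEL_LOADED).inspect
puts defined?(Find).inspect
```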
1154
-
1155
- Below we include file 'model.rb', which is in the same directory as this blog.
1156
- This code uses the R 'caret' package to split a dataset into train and test sets.
1157
- The 'caret' package is a very important and useful package for doing Data Analysis,
1158
- it has hundreds of functions for all steps of the Data Analysis workflow. To
1159
- use 'caret' just to split a dataset is like using the proverbial cannon to
1160
- kill the fly. We use it here only to show that integrating Ruby and R and
1161
- using even a package as complex as 'caret' is trivial with Galaaz.
1162
-
1163
- A word of advice: the 'caret' package has lots of dependencies and installing
1164
- it on a Linux system is a time-consuming operation. Method 'R.install_and_loads'
1165
- will install the package if it is not already installed and can take a while.
1166
-
1167
- ````
1168
- ```{include model}
1169
- ```
1170
- ````
1171
-
1172
-
1173
- ```include
1174
- require 'galaaz'
1175
-
1176
- # Loads the R 'caret' package. If not present, installs it
1177
- R.install_and_loads 'caret'
1178
-
1179
- class Model
1180
-
1181
- attr_reader :data
1182
- attr_reader :test
1183
- attr_reader :train
1184
-
1185
- #==========================================================
1186
- #
1187
- #==========================================================
1188
-
1189
- def initialize(data, percent_train:, seed: 123)
1190
-
1191
- R.set__seed(seed)
1192
- @data = data
1193
- @percent_train = percent_train
1194
- @seed = seed
1195
-
1196
- end
1197
-
1198
- #==========================================================
1199
- #
1200
- #==========================================================
1201
-
1202
- def partition(field)
1203
-
1204
- train_index =
1205
- R.createDataPartition(@data.send(field), p: @percent_train,
1206
- list: false, times: 1)
1207
- @train = @data[train_index, :all]
1208
- @test = @data[-train_index, :all]
1209
-
1210
- end
1211
-
1212
- end
1213
-
1214
- ```
1215
-
1216
-
1217
- ```ruby
1218
- mtcars = ~:mtcars
1219
- model = Model.new(mtcars, percent_train: 0.8)
1220
- model.partition(:mpg)
1221
- puts model.train.head
1222
- puts model.test.head
1223
- ```
1224
-
1225
- ```
1226
- ## mpg cyl disp hp drat wt qsec vs am gear carb
1227
- ## Datsun 710 22.8 4 108.0 93 3.85 2.320 18.61 1 1 4 1
1228
- ## Hornet 4 Drive 21.4 6 258.0 110 3.08 3.215 19.44 1 0 3 1
1229
- ## Hornet Sportabout 18.7 8 360.0 175 3.15 3.440 17.02 0 0 3 2
1230
- ## Merc 240D 24.4 4 146.7 62 3.69 3.190 20.00 1 0 4 2
1231
- ## Merc 280 19.2 6 167.6 123 3.92 3.440 18.30 1 0 4 4
1232
- ## Merc 280C 17.8 6 167.6 123 3.92 3.440 18.90 1 0 4 4
1233
- ## mpg cyl disp hp drat wt qsec vs am gear carb
1234
- ## Mazda RX4 21.0 6 160.0 110 3.90 2.620 16.46 0 1 4 4
1235
- ## Mazda RX4 Wag 21.0 6 160.0 110 3.90 2.875 17.02 0 1 4 4
1236
- ## Valiant 18.1 6 225.0 105 2.76 3.460 20.22 1 0 3 1
1237
- ## Duster 360 14.3 8 360.0 245 3.21 3.570 15.84 0 0 3 4
1238
- ## Merc 230 22.8 4 140.8 95 3.92 3.150 22.90 1 0 4 2
1239
- ## Merc 450SE 16.4 8 275.8 180 3.07 4.070 17.40 0 0 3 3
1240
- ```
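
For comparison, a seeded train/test split can be sketched in plain Ruby without 'caret'. This is a deliberately simpler technique, a plain random split, whereas 'createDataPartition' tries to keep the outcome variable balanced across the two sets. The helper `split_rows` and the sample rows are hypothetical.

```ruby
# Plain-Ruby sketch of a seeded train/test split (no 'caret' involved).
# This is a simple random split; it does not balance the outcome
# variable the way caret's createDataPartition aims to.
def split_rows(rows, percent_train:, seed: 123)
  rng = Random.new(seed)                  # same seed, same split
  shuffled = rows.shuffle(random: rng)
  n_train = (rows.size * percent_train).round
  [shuffled.first(n_train), shuffled.drop(n_train)]
end

# Made-up rows standing in for a data frame
rows = (1..10).map { |i| { id: i, mpg: 10.0 + i } }
train_rows, test_rows = split_rows(rows, percent_train: 0.8)

puts "train: #{train_rows.size} rows, test: #{test_rows.size} rows"
```

As with 'R.set__seed' in the 'Model' class, fixing the seed makes the partition reproducible between runs.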
1241
-
1242
- ### Documenting Gems
1243
-
1244
- gKnit also allows developers to document and load files that are not in the same directory
1245
- as the '.Rmd' file.
1246
-
1247
- Here is an example of loading the 'find.rb' file from TruffleRuby. In this example, relative
1248
- is set to FALSE, so Ruby will look for the file in its $LOAD\_PATH, and the user does not
1249
- need to know its directory.
1250
-
1251
- ````
1252
- ```{include find, relative = FALSE}
1253
- ```
1254
- ````
1255
-
1256
-
1257
- ```include
1258
- # frozen_string_literal: true
1259
- #
1260
- # find.rb: the Find module for processing all files under a given directory.
1261
- #
1262
-
1263
- #
1264
- # The +Find+ module supports the top-down traversal of a set of file paths.
1265
- #
1266
- # For example, to total the size of all files under your home directory,
1267
- # ignoring anything in a "dot" directory (e.g. $HOME/.ssh):
1268
- #
1269
- # require 'find'
1270
- #
1271
- # total_size = 0
1272
- #
1273
- # Find.find(ENV["HOME"]) do |path|
1274
- # if FileTest.directory?(path)
1275
- # if File.basename(path)[0] == ?.
1276
- # Find.prune # Don't look any further into this directory.
1277
- # else
1278
- # next
1279
- # end
1280
- # else
1281
- # total_size += FileTest.size(path)
1282
- # end
1283
- # end
1284
- #
1285
- module Find
1286
-
1287
- #
1288
- # Calls the associated block with the name of every file and directory listed
1289
- # as arguments, then recursively on their subdirectories, and so on.
1290
- #
1291
- # Returns an enumerator if no block is given.
1292
- #
1293
- # See the +Find+ module documentation for an example.
1294
- #
1295
- def find(*paths, ignore_error: true) # :yield: path
1296
- block_given? or return enum_for(__method__, *paths, ignore_error: ignore_error)
1297
-
1298
- fs_encoding = Encoding.find("filesystem")
1299
-
1300
- paths.collect!{|d| raise Errno::ENOENT, d unless File.exist?(d); d.dup}.each do |path|
1301
- path = path.to_path if path.respond_to? :to_path
1302
- enc = path.encoding == Encoding::US_ASCII ? fs_encoding : path.encoding
1303
- ps = [path]
1304
- while file = ps.shift
1305
- catch(:prune) do
1306
- yield file.dup.taint
1307
- begin
1308
- s = File.lstat(file)
1309
- rescue Errno::ENOENT, Errno::EACCES, Errno::ENOTDIR, Errno::ELOOP, Errno::ENAMETOOLONG
1310
- raise unless ignore_error
1311
- next
1312
- end
1313
- if s.directory? then
1314
- begin
1315
- fs = Dir.children(file, encoding: enc)
1316
- rescue Errno::ENOENT, Errno::EACCES, Errno::ENOTDIR, Errno::ELOOP, Errno::ENAMETOOLONG
1317
- raise unless ignore_error
1318
- next
1319
- end
1320
- fs.sort!
1321
- fs.reverse_each {|f|
1322
- f = File.join(file, f)
1323
- ps.unshift f.untaint
1324
- }
1325
- end
1326
- end
1327
- end
1328
- end
1329
- nil
1330
- end
1331
-
1332
- #
1333
- # Skips the current file or directory, restarting the loop with the next
1334
- # entry. If the current file is a directory, that directory will not be
1335
- # recursively entered. Meaningful only within the block associated with
1336
- # Find::find.
1337
- #
1338
- # See the +Find+ module documentation for an example.
1339
- #
1340
- def prune
1341
- throw :prune
1342
- end
1343
-
1344
- module_function :find, :prune
1345
- end
1346
- ```
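The non-relative lookup described above can be pictured as a walk over Ruby's `$LOAD_PATH`. The following is only a sketch of that idea, assuming the file name is given without its `.rb` extension. The helper `resolve_in_load_path` is a hypothetical name and is not part of gKnit; how gKnit actually resolves the file may differ.

```ruby
# Hypothetical helper (not gKnit's implementation): return the first
# file named "<name>.rb" found in the given load path, or nil.
def resolve_in_load_path(name, load_path = $LOAD_PATH)
  file = name.end_with?(".rb") ? name : "#{name}.rb"
  load_path.each do |dir|
    candidate = File.join(dir.to_s, file)
    return candidate if File.file?(candidate)
  end
  nil
end

# Prints the location of the standard library's find.rb, if present
puts resolve_in_load_path("find")
```

The printed path depends on the Ruby installation, which is why the user does not need to know it in advance.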
1347
-
1348
- ## Converting to PDF
1349
-
1350
- One of the beauties of knitr is that the same input can be converted to many different outputs.
1351
- One very useful format is, of course, PDF. In order to convert an __R Markdown__ file to PDF,
1352
- it is necessary to have LaTeX installed on the system. We will not explain here how to
1353
- install LaTeX as there are plenty of documents on the web showing how to proceed.
1354
-
1355
- gKnit comes with a simple LaTeX style file for gknitting this blog as a PDF document. Here is
1356
- the YAML header to generate this blog in PDF format instead of HTML:
1357
-
1358
- ```
1359
- ---
1360
- title: "gKnit - Ruby and R Knitting with Galaaz in GraalVM"
1361
- author: "Rodrigo Botafogo"
1362
- tags: [Galaaz, Ruby, R, TruffleRuby, FastR, GraalVM, knitr, gknit]
1363
- date: "29 October 2018"
1364
- output:
1365
-   pdf_document:
1366
-     includes:
1367
-       in_header: ["../../sty/galaaz.sty"]
1368
-     number_sections: yes
1369
- ---
1370
- ```
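Since the nesting of the `output` options is what selects the PDF format and attaches the style file, it can help to check how the header parses. The following is a small sketch using Ruby's standard `yaml` library, assuming the header is written with plain underscores and two-space indentation. It only inspects the structure and does not run gKnit.

```ruby
require "yaml"

# Header fields copied from the example above, with assumed indentation
HEADER = <<~YAML
  title: "gKnit - Ruby and R Knitting with Galaaz in GraalVM"
  output:
    pdf_document:
      includes:
        in_header: ["../../sty/galaaz.sty"]
      number_sections: yes
YAML

header = YAML.safe_load(HEADER)
pdf = header.fetch("output").fetch("pdf_document")

# The style file is nested under includes/in_header
puts pdf.fetch("includes").fetch("in_header").inspect
# YAML reads the bare word `yes` as a boolean
puts pdf.fetch("number_sections").inspect
```

If `number_sections` were indented under `includes`, the lookup on `pdf` would fail, which makes indentation mistakes easy to spot.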
1371
-
1372
-
1373
- # Conclusion
1374
-
1375
- In order to do reproducible research, one of the main basic tools needed is a system that
1376
- allows "literate programming" where text, code and possibly a set of files can be compiled
1377
- into a report that can be easily distributed to peers. Peers should be able to use this
1378
- same set of files to rerun the compilation on their own, obtaining exactly the same original
1379
- report. gKnit is such a system for Ruby and R. It uses __R Markdown__ to integrate
1380
- text and code chunks, where code chunks can either be part of the __R Markdown__ file or
1381
- be imported from files in the system. Ideally, in reproducible research, all the files
1382
- needed to rebuild a report should be easily packed together (in the same zipped directory)
1383
- and distributed to peers for reexecution.
1384
-
1385
- One of the promises of Oracle's GraalVM is that users/developers will be able to use the best tool
1386
- for their task at hand, independently of the programming language the tool was written in.
1387
- We developed and implemented Galaaz atop the GraalVM and Truffle interop messages and
1388
- the time and effort to wrap Ruby over R - Galaaz - or to
1389
- wrap Knitr with gKnit was a fraction of a fraction of a fraction (one man effort for a couple
1390
- of hours a day, for approximately six months) of the time required to
1391
- implement the original tools. Trying to reimplement all R packages in Ruby would require the
1392
- same effort it is taking Python to implement NumPy, Pandas and all supporting libraries and it
1393
- is unlikely that this effort would ever be made. GraalVM has allowed Ruby to profit "almost
1394
- for free" from this huge set of libraries and tools that make R one of the most used
1395
- languages for data analysis and machine learning.
1396
-
1397
- More interesting than wrapping the R libraries with Ruby is that Ruby adds
1398
- value to R, by allowing developers to use powerful and modern constructs for code reuse that
1399
- are not the strong points of R. As shown in this blog, R and Ruby can easily communicate
1400
- and R can be structured in classes and modules in a way that greatly expands its power and
1401
- readability.
1402
-
1403
- # Installing gKnit
1404
-
1405
- ## Prerequisites
1406
-
1407
- * GraalVM (>= rc8)
1408
- * TruffleRuby
1409
- * FastR
1410
-
1411
- The following R packages will be automatically installed when necessary, but could be installed prior
1412
- to using gKnit if desired:
1413
-
1414
- * ggplot2
1415
- * gridExtra
1416
- * knitr
1417
-
1418
- Installation of R packages requires a development environment and can be time-consuming. On Linux,
1419
- the GNU compiler and tools should be enough. I am not sure what is needed on the Mac.
1420
-
1421
- ## Preparation
1422
-
1423
- * gem install galaaz
1424
-
1425
- ## Usage
1426
-
1427
- * gknit \<filename\>
1428
-
1429
- # References
1430
-