red_amber 0.5.0 → 0.5.2

@@ -0,0 +1,292 @@
1
+ # How to use Development Containers in RedAmber
2
+
3
+ This repository supports [Development Containers](https://containers.dev/).
4
+ You can prepare a container as a full-featured development environment for RedAmber. A Dev Container encapsulates Ruby, Apache Arrow, RedAmber with its source tree, the GitHub CLI, sample datasets, and Jupyter Lab with the IRuby kernel, so you don't need to worry about changing your local environment.
5
+
6
+ The `.devcontainer` directory in this repository contains the Dev Container settings for RedAmber. We don't use a Dockerfile here; the container is based on the Ubuntu image for Dev Containers, with Python and the GitHub CLI added through Dev Container Features. I think this style is simple, maintainable, and reusable. Ruby is added by a script after the container is created.
7
+
8
+ Compared to building the dev environment from a Dockerfile, this approach has the following benefits:
9
+
10
+ 1) It automatically sets up a user with the same UID/GID as the local user.
11
+ 2) Additional tools can be added through `Dev Container Features`.
12
+ 3) The Ruby Feature includes `rbenv`, so it is easy to add another Ruby version afterwards.
13
+ 4) The Python Feature includes Jupyter Lab as an option.
14
+ 5) Quarto is included. It converts Jupyter notebooks to and from qmd files, which is useful for managing notebooks in the source tree.
15
+
16
+ Two ways to start the container are shown here.
17
+
18
+ ## 1. GitHub Codespaces in the cloud, from a browser
19
+
20
+ ### Prerequisite
21
+
22
+ You need to be signed in to your GitHub account.
23
+
24
+ ### Notice
25
+
26
+ The steps below consume the Codespaces quota of your account. With GitHub Free, you get 120 core-hours per month (that is, 60 hours per month on a 2-core VM) and 15 GB of storage per month for free. You can check your usage in the `Codespaces` section of [Billing and plans](https://github.com/settings/billing).
27
+
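As a rough worked example of that arithmetic, here is a small sketch. It assumes the 120 core-hours figure quoted above; the function name `free_hours` is illustrative, and shell integer division truncates for core counts that do not divide 120 evenly.

```shell
#!/bin/sh
# Illustrative only: hours of free Codespaces usage per month for a VM
# with the given number of cores, assuming a quota of 120 core-hours.
free_hours() {
  cores=$1
  echo $((120 / cores))
}

free_hours 2   # prints 60
free_hours 4   # prints 30
```

Larger machine types use up the free quota proportionally faster.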
28
+ ### Procedures
29
+ - Open the [repository of RedAmber](https://github.com/red-data-tools/red_amber) on GitHub.
30
+ - If you are going to develop RedAmber, fork it and open your fork so that you can push a PR later.
31
+
32
+ - Click the `<>Code` button, select the `Codespaces` tab, and click the `Create codespace on main` button to create a new Codespace.
33
+ * If you already have a Codespace, you can reconnect to it instead.
34
+ - Creating a Codespace takes time. You can click `View log` to watch the build log, or have a coffee while you wait.
35
+ * I am planning to improve the build process by saving a cache in GitHub Container Registry.
36
+
37
+ - VS Code in the browser will open the repository in the remote container.
38
+
39
+ Please refer to [Operations](#operations) for how to use the environment.
40
+
41
+ ### Details
42
+ Please see [(GitHub Docs) Creating a codespace for a repository](https://docs.github.com/en/codespaces/developing-in-codespaces/creating-a-codespace-for-a-repository) for details.
43
+
44
+ ## 2. Start a Dev Container from a local repository and use it from VS Code
45
+
46
+ ### Prerequisites
47
+ - Visual Studio Code (October 2020 Release 1.51+)
48
+
49
+ You need to install the GitHub Codespaces extension and sign in to GitHub Codespaces with your GitHub credentials. Please see [GitHub Docs - GitHub Codespaces - Prerequisites](https://docs.github.com/en/codespaces/developing-in-codespaces/using-github-codespaces-in-visual-studio-code#prerequisites) to prepare the settings.
50
+
51
+ - Docker
52
+ - Windows
53
+
54
+ On Windows 10 Pro/Enterprise: Docker Desktop 2.0+.
55
+ On Windows 10 Home (2004+): Docker Desktop 2.3+ with the WSL 2 backend.
56
+
57
+ - Mac
58
+
59
+ Docker Desktop 2.0+
60
+
61
+ - Linux
62
+
63
+ Docker CE/EE 18.06+ and Docker Compose 1.21+
64
+
65
+ - Git
66
+
67
+ ### Procedures
68
+
69
+ - Create a local clone of the RedAmber repository.
70
+
71
+ - If you are going to develop RedAmber, make a fork and clone it.
72
+
73
+ ```
74
+ $ git clone https://github.com/(red-data-tools or your account name)/red_amber.git
75
+ ```
76
+
77
+ Alternatively, using the GitHub CLI:
78
+
79
+ ```
80
+ $ gh repo clone (red-data-tools or your account name)/red_amber
81
+ ```
82
+
83
+ - Open the local repository folder in VS Code.
84
+
85
+ ```
86
+ $ code red_amber
87
+ ```
88
+
89
+ - Reopen in container
90
+
91
+ Reopen the current folder in the container.
92
+
93
+ - Click the remote host indicator in the bottom-left corner to show the options for opening a remote window, then choose 'Reopen in Container'.
94
+
95
+ - The container build will start.
96
+
97
+ The first build takes time. When it finishes, the container name is displayed in the remote host indicator.
98
+
99
+ ## Operations
100
+
101
+ ### Check installed tools in the terminal
102
+
103
+ If you don't have the terminal open, open it with ``CTRL + ` ``.
104
+
105
+ Run these commands to check that the tools are installed.
106
+
107
+ ```shell
108
+ $ ruby -v --jit
109
+ $ rbenv versions
110
+ $ gem -v
111
+ $ gem list
112
+ $ bundler -v
113
+ $ iruby -v
114
+
115
+ $ python --version
116
+ $ pip --version
117
+ $ pip list
118
+ $ pipenv --version
119
+ $ jupyter --version
120
+ $ jupyter kernelspec list
121
+
122
+ $ git --version
123
+ $ git config user.name
124
+ $ gh --version
125
+ ```
126
+
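The same check can be scripted. The following is a minimal sketch and not part of the repository: it assumes a POSIX shell and relies only on `command -v`. The function name `report_tools` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with RedAmber): report which of the
# given command-line tools can be found on PATH.
report_tools() {
  missing=0
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "ok      $tool"
    else
      echo "missing $tool"
      missing=1
    fi
  done
  # Non-zero exit status if at least one tool was not found.
  return "$missing"
}
```

For example, `report_tools ruby rbenv gem bundler iruby python pip pipenv jupyter git gh` prints one line per tool and returns a non-zero status if any of them is missing. It only checks that each command exists, not its version.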
127
+ The user name is `vscode` in this environment. `uid` and `gid` are the same as those of the local user.
128
+
129
+ ```shell
130
+ $ id
131
+ ```
132
+
133
+ ### Run tests of RedAmber
134
+
135
+ ```shell
136
+ $ bundle exec rake
137
+ ```
138
+
139
+ ### Try RedAmber in REPL
140
+
141
+ You can try RedAmber in `irb` using pre-loaded datasets. The first run takes time because the datasets are loaded from Red Datasets.
142
+
143
+ ```ruby
144
+ $ rake example
145
+
146
+ (snip)
147
+
148
+ 81: # Welcome to RedAmber example!
149
+ 82: # This environment will offer these pre-loaded datasets:
150
+ 83: # penguins, diamonds, iris, starwars, simpsons_paradox_covid,
151
+ 84: # mtcars, band_members, band_instruments, band_instruments2
152
+ 85: # import_cars, comecome, rubykaigi, dataframe, subframes
153
+ => 86: binding.irb
154
+
155
+ irb(main):001:0>
156
+ ```
157
+
158
+ The script stops at `binding.irb`, where the datasets are available as local variables.
159
+
160
+ ```ruby
161
+ irb(main):001:0> import_cars
162
+ =>
163
+ #<RedAmber::DataFrame : 5 x 6 Vectors, 0x0000000000010914>
164
+ Year Audi BMW BMW_MINI Mercedes-Benz VW
165
+ <int64> <int64> <int64> <int64> <int64> <int64>
166
+ 0 2017 28336 52527 25427 68221 49040
167
+ 1 2018 26473 50982 25984 67554 51961
168
+ 2 2019 24222 46814 23813 66553 46794
169
+ 3 2020 22304 35712 20196 57041 36576
170
+ 4 2021 22535 35905 18211 51722 35215
171
+ ```
172
+
173
+ The `RedAmber` namespace is included, so its constants can be referenced without the prefix.
174
+
175
+ ```ruby
176
+ irb(main):002:0> VERSION
177
+ => "0.5.0"
178
+ irb(main):003:0> Arrow::VERSION
179
+ => "12.0.1"
180
+ ```
181
+
182
+ You can return to the first breakpoint by entering `@`.
183
+
184
+ You can exit irb with `exit`.
185
+
186
+ ### Try RedAmber in Jupyter Lab
187
+
188
+ You can try Jupyter Lab with Python and IRuby kernels in your browser.
189
+
190
+ ```shell
191
+ $ rake jupyter
192
+ ```
193
+
194
+ - `doc/notebook` is used as the notebook folder. It contains 2 files.
195
+ - `red_amber.ipynb` : Examples in `README.md`.
196
+ - `examples_of_red_amber.ipynb` : Hundreds of examples of RedAmber.
197
+ - `require 'red_amber'` loads the library from the source directory `lib`.
198
+
199
+ ## Document authoring with Quarto
200
+
201
+ [Quarto](https://quarto.org/) is an open-source scientific and technical publishing system.
202
+ We use Quarto CLI to show usage examples of RedAmber.
203
+
204
+ ```mermaid
205
+ ---
206
+ title: Document management with Quarto
207
+ ---
208
+ flowchart LR
209
+ id1["Source management
210
+ (.qmd)"]
211
+ id2["Analyze and edit by JupyterLab
212
+ (.ipynb)"]
213
+ id3["Publish document
214
+ (.pdf)"]
215
+
216
+ id1 -- convert --> id2 -- convert --> id1
217
+ id2 -- render --> id3
218
+ id1 -- render --> id3
219
+ ```
220
+
221
+ * We can manage the source of a Jupyter notebook in Quarto's markdown format, `qmd`.
222
+ * We can convert a `.qmd` file to a Jupyter notebook file (`.ipynb`), then edit it or run analyses in Jupyter Lab.
223
+ * We can render `.qmd` or `.ipynb` files to `.pdf`.
224
+
225
+ ### Show information about Quarto
226
+
227
+ Run the commands below to show the version and verify that the Quarto installation works correctly.
228
+
229
+ ```shell
230
+ $ quarto -v
231
+ $ quarto check
232
+ ```
233
+
234
+ To show help:
235
+
236
+ ```shell
237
+ $ quarto --help
238
+ $ quarto render --help
239
+ ```
240
+
241
+ ### Convert qmd file to Jupyter Notebook
242
+
243
+ To convert the `.qmd` source files to `.ipynb`:
244
+
245
+ ```shell
246
+ $ bundle exec rake quarto:convert
247
+ ```
248
+
249
+ This command creates `ipynb` notebooks from the files in `doc/qmd` and saves them to `doc/notebook`.
250
+
251
+ More generally,
252
+
253
+ ```shell
254
+ $ quarto convert source_file.qmd
255
+ $ quarto convert source_file.qmd --output Notebook.ipynb
256
+ ```
257
+
258
+ The first command writes `source_file.ipynb` to the same directory as the source file.
259
+
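To convert several files by hand, the two forms of `quarto convert` can be combined in a loop. This is a minimal sketch and not how the repository's rake task is implemented. The helper names `notebook_path` and `convert_all` are hypothetical, and the only Quarto usage assumed is `quarto convert` with `--output`.

```shell
#!/bin/sh
# Hypothetical helper: map a .qmd path to the .ipynb path it should be
# written to inside the output directory.
notebook_path() {
  echo "$2/$(basename "$1" .qmd).ipynb"
}

# Hypothetical helper: convert every .qmd file in a source directory,
# writing the notebooks to an output directory.
convert_all() {
  src_dir=$1
  out_dir=$2
  mkdir -p "$out_dir"
  for qmd in "$src_dir"/*.qmd; do
    [ -e "$qmd" ] || continue   # skip if no .qmd files are present
    quarto convert "$qmd" --output "$(notebook_path "$qmd" "$out_dir")"
  done
}
```

Calling `convert_all doc/qmd doc/notebook` would then write one notebook per source file into `doc/notebook`.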
260
+ The command below converts the qmd files in `doc/qmd` to `.ipynb` files, saves them to `doc/notebook`, and then opens `doc/notebook` in Jupyter Lab.
261
+
262
+ ```shell
263
+ $ bin/jupyter
264
+ ```
265
+
266
+ ### Convert Jupyter Notebook to `qmd`
267
+
268
+ You can also convert a notebook file to a qmd file.
269
+
270
+ ```shell
271
+ $ quarto convert notebook.ipynb
272
+ $ quarto convert notebook.ipynb --output output_source_file.qmd
273
+ ```
274
+
275
+ ### Others
276
+
277
+ To render the notebook sources in `doc/qmd` to PDF:
278
+
279
+ ```shell
280
+ $ bundle exec rake quarto:test
281
+ ```
282
+ To clear `doc/notebook` (and all other artifacts generated by rake):
283
+
284
+ ```shell
285
+ $ rake clean
286
+ ```
287
+
288
+ To learn more about Quarto, see the command-line help with `quarto --help`, or visit the [Quarto](https://quarto.org/) website.
289
+
290
+ ### Thanks
291
+
292
+ I started trying Quarto after seeing Kozo Nishida's work under the Ruby Association Grant 2022, "Introducing Quarto into the RubyData ecosystem and promoting the combination to the Ruby community". I would like to take this opportunity to thank him.