rabbit-slide-hasumikin-RubyKaigi2024 2024.05.16.0

@@ -0,0 +1,1103 @@
1
+ # Unlock The Universal Parsers
2
+
3
+ subtitle
4
+ : A New PicoRuby Compiler
5
+
6
+ author
7
+ : @hasumikin
8
+
9
+ institution
10
+ : Monstarlab
11
+
12
+ allotted-time
13
+ : 30m
14
+
15
+ theme
16
+ : rubykaigi-2024
17
+
18
+ # [AD] "n-monthly Lambda Note" Vol.4, No.1
19
+
20
+ 　
21
+ 　
22
+ **Learning How
23
+ Programming Languages
24
+ Control Electrical Circuits
25
+ with PicoRuby**
26
+
27
+ {::tag name="small"}(Japanese only){:/tag}
28
+
29
+ 1,760 yen {::tag name="x-small"}(tax included){:/tag}
30
+
31
+ ![](images/lambdanote.png){:
32
+ align="right"
33
+ width="550"
34
+ relative_margin_top="-2"
35
+ relative_margin_right="5"
36
+ }
37
+
38
+ # CRuby's parser
39
+
40
+ - parse.*y* and prism.*c*: human-writable
41
+ - parse.*c*: generated by Bison or Lrama
42
+ - prism.*c*: no generator needed
43
+
44
+ ```mermaid
45
+ graph LR;
46
+ parse_y(parse.y) -->|Bison| parse_c_1[parse.c #40;retired#41;]
47
+ parse_y(parse.y) -->|Lrama| parse_c_2[lrama-generated parse.c]
48
+ prism(prism.c)
49
+ ```
50
+ {:relative_height="100"}
51
+
52
+ # Prism and Lrama-generated parser
53
+
54
+ - Prism
55
+   - Newly developed universal Ruby parser
56
+   - Portable, with a small footprint
57
+   - Easy to integrate
58
+   - Available in the form of a gem
59
+ - Lrama-generated parser (parse.c)
60
+   - 100% compliant with CRuby in every aspect
61
+   - Not portable by nature
62
+
63
+ # Parsers in Ruby (as of May 2024)
64
+
65
+ |Ruby|Parser|
66
+ |---------|--------|
67
+ |CRuby (default)|Lrama-generated|
68
+ |CRuby (experimental)|Prism|
69
+ |JRuby|Prism|
70
+ |TruffleRuby|Prism|
71
+ |*mruby*|*by Matz*|
72
+ |*PicoRuby*|*by hasumikin (me)*|
73
+ |IRB(katakata_irb)|(gem) Prism|
74
+ |RuboCop|(gem) Parser → Prism|
75
+
76
+ # Prism in RuboCop
77
+
78
+ ![](images/koic.png){:width="1500" relative_margin_top="0"}
79
+
80
+ # Parsers in Ruby *(I'm working on)*
81
+
82
+ |Ruby|Parser|
83
+ |---------|--------|
84
+ |CRuby (default)|Lrama-generated|
85
+ |CRuby (optional)|Prism|
86
+ |JRuby|Prism|
87
+ |TruffleRuby|Prism|
88
+ |*mruby*|*???????*|
89
+ |*PicoRuby*|*???????*|
90
+ |IRB(katakata_irb)|(gem) Prism|
91
+ |RuboCop|(gem) Parser → Prism|
92
+
93
+ # mruby compiler generates VM code
94
+
95
+ ```bash
96
+ $ echo 'puts "Hello, World!"' | bin/mrbc - | xxd # hexdump
97
+ 00000000: 5249 5445 3033 3030 0000 005e 4d41 545a RITE0300...^MATZ
98
+ 00000010: 3030 3030 4952 4550 0000 0042 3033 3030 0000IREP...B0300
99
+ 00000020: 0000 0036 0000 0004 0000 0000 0000 000a ...6............
100
+ 00000030: 5102 002d 0100 0138 0169 0001 0000 0d48 Q..-...8.i.....H
101
+ 00000040: 656c 6c6f 2c20 576f 726c 6421 0000 0100 ello, World!....
102
+ 00000050: 0470 7574 7300 454e 4400 0000 0008 .puts.END.....
103
+
104
+ $ echo 'puts "Hello, World!"' | bin/mrbc - | bin/mruby
105
+ Hello, World!
106
+ ```
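The first bytes of the dump above can be decoded by hand. The following is a minimal Ruby sketch, assuming the layout that the visible ASCII suggests: a 4-byte `RITE` identifier, a 4-byte version, a 32-bit big-endian total size, the compiler name and version, and then an `IREP` section header. The variable names are illustrative labels, not names taken from mruby.

```ruby
# First 32 bytes of the hexdump above, copied verbatim as hex digits.
hex = "5249544530333030" \
      "0000005e4d41545a" \
      "3030303049524550" \
      "0000004230333030"
bytes = [hex].pack("H*")

# Field names are illustrative labels; "N" reads a 32-bit big-endian integer.
ident, version, total_size, compiler, compiler_version,
  section, section_size, section_version = bytes.unpack("a4 a4 N a4 a4 a4 N a4")

puts "#{ident} #{version}: #{total_size} bytes, compiler #{compiler}#{compiler_version}"
puts "first section: #{section} (#{section_size} bytes, version #{section_version})"
```

The decoded total size is 94 bytes (0x5e), which matches the length of the dump: five full 16-byte rows plus 14 bytes in the last row.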
107
+
108
+ {:.center}
109
+ (PicoRuby compiler does the same)
110
+
111
+ # PicoRuby compiler must have a small footprint
112
+
113
+ - Raspberry Pi Pico
114
+   - 133 MHz Arm Cortex-M0+ (dual)
115
+   - *264 KB SRAM*
116
+   - 2 MB Flash ROM
117
+
118
+ ![](images/rpi_pico.jpg){:align="right" height="800"}
119
+
120
+ # PicoRuby compiler must have a small footprint
121
+
122
+ - Raspberry Pi Pico
123
+   - 133 MHz Arm Cortex-M0+ (dual)
124
+   - *264 KB SRAM*
125
+   - 2 MB Flash ROM
126
+
127
+ {:.center}
128
+ 　
129
+ {::tag name="x-large"}*If a universal parser is not small enough, it won't be "universal"*{:/tag}
130
+
131
+ ![](images/rpi_pico.jpg){:align="right" height="800"}
132
+
133
+ # PicoRuby compiler must have a small footprint
134
+
135
+ ```bash
136
+ $ valgrind \
137
+ --tool=massif \
138
+ --stacks=yes \
139
+ path/to/mrbc hello_world.rb
140
+ #=> massif.out.12345
141
+
142
+ $ ms_print massif.out.12345 | less
143
+ ```
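Besides reading the `ms_print` graph, the peak can be computed from the `massif.out.*` file itself. The sketch below assumes the plain-text snapshot format in which each snapshot carries `mem_heap_B`, `mem_heap_extra_B`, and `mem_stacks_B` lines. The sample text and its numbers are made up for illustration, and `peak_bytes` is a hypothetical helper.

```ruby
# Made-up excerpt in the massif.out snapshot format; not real measurements.
SAMPLE = <<~MASSIF
  snapshot=0
  time=0
  mem_heap_B=0
  mem_heap_extra_B=0
  mem_stacks_B=1024
  snapshot=1
  time=1000
  mem_heap_B=4096
  mem_heap_extra_B=64
  mem_stacks_B=2048
MASSIF

# Hypothetical helper: total of heap, heap overhead, and stack per snapshot,
# then the maximum over all snapshots.
def peak_bytes(text)
  text.split(/^snapshot=\d+$/).map { |snapshot|
    snapshot.scan(/^mem_(?:heap|heap_extra|stacks)_B=(\d+)$/).flatten.sum(&:to_i)
  }.max
end

puts peak_bytes(SAMPLE)
```

With a real file, `peak_bytes(File.read("massif.out.12345"))` would be the equivalent call, where the file name is the one produced by the `valgrind` run.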
144
+
145
+ # The original mruby Compiler
146
+
147
+ ```bash
148
+ --------------------------------------------------------------------------------
149
+ Command: bin/mrbc ../mruby-compiler2/fixtures/hello_world.rb
150
+ Massif arguments: --stacks=yes
151
+ ms_print arguments: massif.out.29290
152
+ --------------------------------------------------------------------------------
153
+ (graph: peak 129.5 KB; x-axis 0 to 1.030 Mi instructions)
176
+ ```
177
+
178
+ # Existing PicoRuby Compiler
179
+
180
+ ```bash
181
+ --------------------------------------------------------------------------------
182
+ Command: bin/picorbc ../mruby-compiler2/fixtures/hello_world.rb
183
+ Massif arguments: --stacks=yes
184
+ ms_print arguments: massif.out.28004
185
+ --------------------------------------------------------------------------------
186
+ (graph: peak 16.70 KB; x-axis 0 to 219.6 ki instructions)
209
+ ```
210
+
211
+ # mrbc_prism
212
+
213
+ ```bash
214
+ --------------------------------------------------------------------------------
215
+ Command: bin/mrbc_prism ../mruby-compiler2/fixtures/hello_world.rb
216
+ Massif arguments: --stacks=yes
217
+ ms_print arguments: massif.out.31071
218
+ --------------------------------------------------------------------------------
219
+ (graph: peak 9.219 KB; x-axis 0 to 201.0 ki instructions)
242
+ ```
243
+
244
+ # RAM consumption
245
+
246
+ |Compiler|RAM consumption|
247
+ |---------|------------------|
248
+ |mrbc (original)|129.5 KB|
249
+ |picorbc|16.70 KB|
250
+ |mrbc_prism|9.219 KB*\**|
251
+
252
+ {:.right}
253
+ {::tag name="xx-small"}Figures were measured on 64-bit Linux{:/tag}
254
+ 　
255
+ {:.left}
256
+ {::tag name="xx-small"}*\** mrbc_prism's small footprint is partly due to its new code generator, and it may grow as development continues{:/tag}
257
+
258
+ # mrbc_prism
259
+
260
+ ```c
261
+ mrc_irep *
262
+ compile_ruby_to_mruby_vm_code_with_prism(mrc_ccontext *c)
263
+ {
264
+   pm_parser_t parser;
265
+   const uint8_t *string = (const uint8_t *)"puts 'Hello, World!'";
266
+   pm_parser_init(&parser,
267
+                  string, strlen((const char *)string),
268
+                  NULL);
269
+   pm_node_t *root = pm_parse(&parser);
270
+   mrc_irep *irep = mrc_load_exec(c, (mrc_node *)root);
271
+   pm_node_destroy(&parser, root);
272
+   pm_parser_free(&parser);
273
+   return irep;
274
+ }
275
+ ```
276
+
277
+ {:.center}
278
+ {::tag name="xx-small"}(For illustrative purposes only and non-functional){:/tag}
279
+
280
+ # Prism, really good stuff
281
+
282
+ - Well designed APIs and NODE
283
+ - Portable
284
+ - Small footprint
285
+
286
+ {:.center}
287
+ 　
288
+ {::tag name="x-large"}*Still, I'm going to integrate the Lrama-generated parser as well.*{:/tag}
289
+ {::tag name="x-large"}*Why?*{:/tag}
290
+
291
+ # In Taipei, Taiwan last year...
292
+
293
+ ![](images/taiwan.jpg){:width="1200"
294
+ draw0="[text, ＠spikeolaf, 0.34, 0.4, {color: white, size: 40, font_family: 'Poppins', weight: bold\}]"
295
+ draw1="[text, ＠koic, 0.82, 0.3, {color: white, size: 40, font_family: 'Poppins', weight: bold\}]"
296
+ draw2="[text, me, 0.05, 0.23, {color: white, size: 40, font_family: 'Poppins', weight: bold\}]"
297
+ }
298
+
299
+ # A Labyrinth
300
+
301
+ ![](images/labyrinth.jpg){:width="1980" height="1250" relative_margin_right="-10" relative_margin_top="-6" align="right"
302
+ draw0="[text, parse.y is a labyrinth, 0.11, 0.7, {color: green, size: 100, font_family: 'Dela Gothic One'\}]"
303
+ }
304
+
305
+ # @ydah_
306
+
307
+ ![](images/ydah_.png){:width="1500" }
308
+
309
+ # @junk0612
310
+
311
+ ![](images/junk0612.png){:width="1500" }
312
+
313
+ # no longer a Labyrinth
314
+
315
+ ![](images/nolongerlabyrinth.jpg){:width="1980" height="1250" relative_margin_right="-10" relative_margin_top="-6" align="right"
316
+ draw0="[text, No longer a labyrinth, 0.11, 0.7, {color: cyan, size: 100, font_family: 'Dela Gothic One'\}]"
317
+ }
318
+
319
+ # Why Lrama-generated parser?
320
+
321
+ - Making parse.y readable and maintainable
322
+ - Integrated error tolerance
323
+ - Innovative state management of the tokenizer
324
+ - In short, a new era for LALR parser generators
325
+
326
+ {:.center}
327
+ 　
328
+ {::tag name="x-large"}*Computer-science-wide achievement*{:/tag}
329
+
330
+ # hide-title
331
+
332
+ {:.center}
333
+ {::tag name="xx-large"}OK, let me use both{:/tag}
334
+ {::tag name="xx-large"}*Prism* and *Lrama-gen parser*{:/tag}
335
+ {::tag name="xx-large"}in my PicoRuby compiler💪{:/tag}
336
+
337
+ ## prop
338
+
339
+ hide-title
340
+ : true
341
+
342
+ # Parsers in Ruby *(I'm working on)*
343
+
344
+ |Ruby|Parser|
345
+ |---------|--------|
346
+ |CRuby (default)|Lrama-generated|
347
+ |CRuby (optional)|Prism|
348
+ |JRuby|Prism|
349
+ |TruffleRuby|Prism|
350
+ |*mruby*|*Prism and Lrama-gen*|
351
+ |*PicoRuby*|*Prism and Lrama-gen*|
352
+ |IRB(katakata_irb)|(gem) Prism|
353
+ |RuboCop|(gem) Parser → Prism|
354
+
355
+ # AST of Lrama-generated parser
356
+
357
+ ```bash
358
+ $ ruby --dump=p -e'puts "Hello, World!"'
359
+ ###########################################################
360
+ ## Do NOT use this node dump for any purpose other than ##
361
+ ## debug and research. Compatibility is not guaranteed. ##
362
+ ###########################################################
363
+
364
+ # @ NODE_SCOPE (id: 3, line: 1, location: (1,0)-(1,20))
365
+ # +- nd_tbl: (empty)
366
+ # +- nd_args:
367
+ # | (null node)
368
+ # +- nd_body:
369
+ # @ NODE_FCALL (id: 0, line: 1, location: (1,0)-(1,20))*
370
+ # +- nd_mid: :puts
371
+ # +- nd_args:
372
+ # @ NODE_LIST (id: 2, line: 1, location: (1,5)-(1,20))
373
+ # +- as.nd_alen: 1
374
+ # +- nd_head:
375
+ # | @ NODE_STR (id: 1, line: 1, location: (1,5)-(1,20))
376
+ # | +- nd_lit: "Hello, World!"
377
+ # +- nd_next:
378
+ # (null node)
379
+ ```
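The same tree can be inspected from Ruby through `RubyVM::AbstractSyntaxTree`, which is CRuby-only and, like the dump above, comes with no compatibility guarantee. The following is a small sketch; `walk` is a hypothetical helper, and the node type names it prints may differ between Ruby versions.

```ruby
# Parse the same one-liner with CRuby's own (parse.y-based) AST API.
root = RubyVM::AbstractSyntaxTree.parse('puts "Hello, World!"')

# Hypothetical helper: collect each node type with indentation,
# descending only into children that are themselves AST nodes.
def walk(node, depth = 0, out = [])
  out << "#{'  ' * depth}#{node.type}"
  node.children.each do |child|
    walk(child, depth + 1, out) if child.is_a?(RubyVM::AbstractSyntaxTree::Node)
  end
  out
end

types = walk(root)
puts types
```

The root is a `SCOPE` node whose body is the `FCALL` for `puts`, mirroring `NODE_SCOPE` and `NODE_FCALL` in the dump.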
380
+
381
+ # hide-title
382
+
383
+ {::tag name="xx-large"}Version 0.0.0.0.1 of mrbc_lrama{:/tag}
384
+ {::tag name="xx-small"}(A mruby compiler using Lrama-generated parser){:/tag}
385
+
386
+ ## prop
387
+
388
+ hide-title
389
+ : true
390
+
391
+ # How to integrate Lrama-gen parser
392
+
393
+ ```bash
394
+ $ cd ruby/ruby
395
+ $ ./autogen.sh #=> configure
396
+ $ ./configure #=> common.mk etc.
397
+ $ make
398
+ $ ls -l *.a
399
+ -rw-r--r-- 1 145797150 libruby-static.a
400
+ ```
401
+
402
+ {:.center}
403
+ {::tag name="x-large"}An archive file containing all the object files of the Ruby interpreter{:/tag}
404
+
405
+ # libruby-static.a
406
+
407
+ ```mermaid
408
+ flowchart LR
409
+ parse.y -->|Lrama| parse.c --> parse.o
410
+ node.c --> node.o
411
+ string.c --> string.o
412
+ subgraph libruby-static.a
413
+ parse.o
414
+ node.o
415
+ string.o
416
+ etc.
417
+ end
418
+ libHOGE.so --> ruby
419
+ libruby-static.a ---> ruby
420
+ ```
421
+ {:relative_height="100"}
422
+
423
+ # Even CRuby can be embedded
424
+
425
+ ```c
426
+ // hello_world.c
427
+ #include <ruby.h>
428
+ int
429
+ main(int argc, char **argv)
430
+ {
431
+ ruby_init();
432
+ rb_eval_string("puts 'Hello, World!'");
433
+ ruby_finalize();
434
+ return 0;
435
+ }
436
+ ```
437
+
438
+ {:.center}
439
+ {::tag name="xx-small"}(For illustrative purposes only and non-functional){:/tag}
440
+
441
+ # Even CRuby can be embedded
442
+
443
+ ```bash
444
+ $ cc -o hello_world \
445
+ hello_world.c \
446
+ -I/path/to/ruby/include \
447
+ -L/path/to/ruby \
448
+ -l:libruby-static.a
449
+ # ^^^^^^^^^^^^^^^^
450
+
451
+ $ ./hello_world
452
+ #=> Hello, World!
453
+ ```
454
+
455
+ # Legendary work of embedded CRuby
456
+
457
+ ![](images/mod_ruby.png){:width="1660"}
458
+
459
+ # hide-title
460
+
461
+ {::tag name="xx-large"}Let's see the memory consumption of mrbc_lrama🧐{:/tag}
462
+
463
+ ## prop
464
+
465
+ hide-title
466
+ : true
467
+
468
+ # Memory consumption of mrbc_lrama
469
+
470
+ ```bash
471
+ --------------------------------------------------------------------------------
472
+ Command: bin/mrbc_lrama ../mruby-compiler2/fixtures/hello_world.rb
473
+ Massif arguments: --stacks=yes
474
+ ms_print arguments: massif.out.14624
475
+ --------------------------------------------------------------------------------
476
+ (graph: peak 9.985 MB; x-axis 0 to 43.25 Mi instructions)
499
+ ```
500
+
501
+ # hide-title
502
+
503
+ {:.center}
504
+ {::tag name="xx-large"}9.985 MB😨{:/tag}
505
+
506
+ ## prop
507
+
508
+ hide-title
509
+ : true
510
+
511
+ # RAM consumption
512
+
513
+ |Compiler|RAM consumption|
514
+ |---------|------------------|
515
+ |mrbc (original)|129.5 KB|
516
+ |picorbc|16.70 KB|
517
+ |mrbc_prism|9.219 KB|
518
+ |mrbc_lrama|9.985 *MB*|
519
+
520
+ # Why does it consume so much RAM?
521
+
522
+ ```c
523
+ mrc_irep *
524
+ compile_ruby_to_mruby_vm_code_with_lramagen(mrc_ccontext *c)
525
+ {
526
+   ruby_init(); // 👈👈👈👈👈👈👈👈
527
+   rb_parser_t *parser = rb_ruby_parser_new();
528
+   rb_ast_t *ast = rb_ruby_parser_compile_string(
529
+     parser, "", "puts 'Hello, World!'", 0);
530
+   mrc_irep *irep = mrc_load_exec(c, (mrc_node *)ast->body.root);
531
+   ruby_finalize();
532
+   return irep;
533
+ }
534
+ ```
535
+
536
+ {:.center}
537
+ {::tag name="xx-small"}(For illustrative purposes only and non-functional){:/tag}
538
+
539
+ {:.center}
540
+ *`ruby_init()`* sets up CRuby's GC.
540
+ However, if you remove it, you get a *SEGV*
542
+
543
+ # Dependencies (actual)
544
+
545
+ ![](images/dependencies-0.png){:width="1700"}
546
+
547
+ # Dependencies (goal)
548
+
549
+ ![](images/dependencies-0.png){:width="1700"
550
+ draw0="[text, 🙅🏻, 0.36, 0.24, {size: 70, font_family: 'Noto Color Emoji'\}]"
551
+ draw1="[text, 🙅🏻, 0.43, 0.49, {size: 70, font_family: 'Noto Color Emoji'\}]"
552
+ }
553
+
554
+ # *DeGC* is what we need
555
+
556
+ - DeVALUE
557
+   - *Decouple VALUE*, the C-level representation of a Ruby object such as RString or RArray
558
+ - DeIMEMO
559
+   - *Decouple IMEMO*, a C-level object managed by GC
560
+ - DeGC
561
+   - Ultimately, we are going to *decouple GC* from the parser
562
+
563
+ # Relevant work of DeVALUE
564
+
565
+ ![](images/sh.png){:width="1500" relative_margin_top="5"}
566
+
567
+ {:.center}
568
+ Lightning Talks today
569
+
570
+ # rb_parser_ary_t* instead of RArray
571
+
572
+ ```c
573
+ enum rb_parser_ary_data_type {
574
+   PARSER_ARY_DATA_AST_TOKEN,
575
+   PARSER_ARY_DATA_SCRIPT_LINE
576
+ };
577
+
578
+ typedef struct rb_parser_ary {
579
+   enum rb_parser_ary_data_type data_type;
580
+   rb_parser_ary_data *data; // typedef of void*
581
+   long len;  // current size
582
+   long capa; // capacity
583
+ } rb_parser_ary_t;
584
+ ```
585
+
586
+ # e.g. DeVALUE of tokens
587
+
588
+ ```ruby
589
+ node = RubyVM::AbstractSyntaxTree.
590
+   parse("puts 'Hello, World!'",
591
+         keep_tokens: true) # 👈
592
+
593
+ node.tokens
594
+ #=>
595
+ [[0, :tIDENTIFIER, "puts", [1, 0, 1, 4]],
596
+ [1, :tSP, " ", [1, 4, 1, 5]],
597
+ [2, :tSTRING_BEG, "'", [1, 5, 1, 6]],
598
+ [3, :tSTRING_CONTENT, "Hello, World!", [1, 6, 1, 19]],
599
+ [4, :tSTRING_END, "'", [1, 19, 1, 20]]]
600
+ ```
601
+
602
+ # e.g. DeVALUE of tokens
603
+
604
+ ```c
605
+ // parse.y
606
+ static void parser_append_tokens(struct parser_params *p,...
607
+ {
608
+   VALUE ary = rb_ary_new2(4); // A token
609
+   (...)
610
+   rb_ary_push(p->tokens, ary);
611
+   (...)
612
+ }
613
+
614
+ (...)
615
+
616
+ static VALUE yycompile0(VALUE arg)
617
+ {
618
+   (...)
619
+   if (p->keep_tokens) {
620
+     p->ast->node_buffer->tokens = tokens;
621
+     p->tokens = NULL;
622
+   }
623
+   (...)
624
+ }
625
+ ```
626
+
627
+ # e.g. DeVALUE of tokens
628
+
629
+ ```c
630
+ struct parser_params {
631
+   (...)
632
+   VALUE tokens;
633
+ };
634
+ ```
635
+
636
+ {:.center}
637
+ 👇
638
+
639
+ ```c
640
+ struct parser_params {
641
+   (...)
642
+   rb_parser_ary_t *tokens;
643
+ };
644
+ ```
645
+
646
+ # e.g. DeVALUE of script_lines
647
+
648
+ - `ast->body.script_lines`
649
+ - `iseq->variable.script_lines`
650
+ - `SCRIPT_LINES__`
651
+
652
+ {:.center}
653
+ 　
654
+ 　
655
+ {::tag name="x-large"}*Three kinds of script_lines*{:/tag}
656
+
657
+ # script_lines in AST
658
+
659
+ ```ruby
660
+ node = RubyVM::AbstractSyntaxTree.
661
+   parse("puts 'Hello, World!'",
662
+         keep_script_lines: true) # 👈
663
+
664
+ node.script_lines
665
+ #=>
666
+ ["puts 'Hello, World!'"]
667
+ ```
668
+
669
+ # script_lines in ISEQ
670
+
671
+ ```ruby
672
+ iseq = RubyVM::InstructionSequence.
673
+   compile("puts 'Hello, World!'",
674
+           keep_script_lines: true) # 👈
675
+
676
+ iseq.script_lines
677
+ #=>
678
+ ["puts 'Hello, World!'"]
679
+ ```
680
+
681
+ # `SCRIPT_LINES__` in top-level
682
+
683
+ ```ruby
684
+ # hello_world.rb
685
+ puts 'Hello, World!'
686
+ ```
687
+
688
+ ```ruby
689
+ SCRIPT_LINES__ = {}
690
+ require './hello_world'
691
+ p SCRIPT_LINES__
692
+ #=>
693
+ {"./hello_world"=>["puts 'Hello, World!'"]}
694
+ ```
695
+
696
+ {:.center}
697
+ debug.rb and tracer.rb used to use `SCRIPT_LINES__`
698
+
699
+ # Internally, they were the same RArray
700
+
701
+ - `ast->body.script_lines`
702
+ - `iseq->variable.script_lines`
703
+ - `SCRIPT_LINES__[filename]`
704
+
705
+ {:.center}
706
+ 　
707
+ Build script_lines as a *`rb_parser_ary_t`*
708
+ during parsing, then convert it to a *VALUE*
709
+ after parsing or when *`Node#script_lines`* is called
710
+
711
+ # IMEMO (Internal MEMOry Object)
712
+
713
+ - IMEMO is a C-level object managed by GC
714
+   - Not a VALUE
715
+ - Restricted to five words so that GC can manage it
716
+   - The first word has to be the flags
717
+   - Only five words can be used in total
718
+
719
+ # e.g. DeIMEMO of AST (before)
720
+
721
+ ```c
722
+ typedef struct rb_ast_struct {
723
+   VALUE flags; // To be removed
724
+   node_buffer_t *node_buffer;
725
+   struct {
726
+     const NODE *root;
727
+     VALUE *script_lines; // To be deVALUEd
728
+     signed int frozen_string_literal:2;
729
+     signed int coverage_enabled:2;
730
+   } body;
731
+ } rb_ast_t;
732
+ ```
733
+
734
+ {:.center}
735
+ {::tag name="x-large"}*Five words*{:/tag}
736
+
737
+ # e.g. DeIMEMO of AST (after)
738
+
739
+ ```c
740
+ typedef struct rb_ast_struct {
741
+   node_buffer_t *node_buffer;
741
+   struct {
742
+     const NODE *root;
743
+     rb_parser_ary_t *script_lines;
744
+     int line_count; // Separated from script_lines
745
+     signed int frozen_string_literal:2;
746
+     signed int coverage_enabled:2;
747
+   } body;
748
+ #ifdef UNIVERSAL_PARSER
749
+   const rb_parser_config_t *config; // Moved from node_buffer_t
751
+ #endif
752
+ } rb_ast_t;
753
+ ```
754
+
755
+ {:.center}
756
+ {::tag name="x-large"}*🤔...Still five words?*{:/tag}
757
+
758
+ # e.g. DeIMEMO of AST (after)
759
+
760
+ ```c
761
+ typedef struct rb_ast_struct {
762
+   node_buffer_t *node_buffer;
762
+   struct {
763
+     const NODE *root;
764
+     rb_parser_ary_t *script_lines;
765
+     int line_count; // One word on 32-bit
766
+     signed int frozen_string_literal:2;
767
+     signed int coverage_enabled:2;
768
+   } body;
769
+ #ifdef UNIVERSAL_PARSER
770
+   const rb_parser_config_t *config; // Moved from node_buffer_t
772
+ #endif
773
+ } rb_ast_t;
774
+ ```
775
+
776
+ {:.center}
777
+ {::tag name="x-large"}*Six words on 32-bit💡*{:/tag}
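The word counts on these slides can be checked with a little layout arithmetic. The sketch below assumes a typical ILP32 or LP64 ABI, where a pointer and a `VALUE` are one word, `int` is 4 bytes, and the two 2-bit bitfields share one `int`. It also assumes `UNIVERSAL_PARSER` is defined. `struct_words` is a hypothetical helper, not something from CRuby.

```ruby
# Hypothetical helper: lay out fields in order with natural alignment and
# return the struct size in machine words. Sizes are ABI assumptions.
def struct_words(fields, word_size)
  sizes = { ptr: word_size, int: 4 }
  offset = 0
  max_align = 1
  fields.each do |kind|
    size = sizes.fetch(kind)
    offset += (-offset) % size      # pad up to the field's alignment
    offset += size
    max_align = [max_align, size].max
  end
  offset += (-offset) % max_align   # tail padding
  offset / word_size
end

# Before: flags, node_buffer, root, script_lines, one int of bitfields.
BEFORE = %i[ptr ptr ptr ptr int]
# After: node_buffer, root, script_lines, line_count, bitfields, config.
AFTER = %i[ptr ptr ptr int int ptr]

[8, 4].each do |word_size|
  puts "#{word_size * 8}-bit: before=#{struct_words(BEFORE, word_size)} " \
       "after=#{struct_words(AFTER, word_size)}"
end
```

On a 64-bit target `line_count` and the bitfields pack into a single word, so the struct stays at five words. On a 32-bit target each takes its own word, which gives six.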
778
+
779
+ # DeGC
780
+
781
+ {:.center}
782
+ 　
783
+ 　
784
+ 　
785
+ {::tag name="x-large"}DeVALUE and DeIMEMO{:/tag}
786
+ {::tag name="x-large"}automatically complete *DeGC*😊{:/tag}
787
+
788
+ # Amount of diff in ruby/ruby by me
789
+
790
+ ```bash
791
+ $ git log --author=hasumikin \
792
+ --since=2024-01-01 --until=2024-4-30 \
793
+ --oneline
794
+
795
+ ddd8da4b6b [Universal parser] Improve AST structure
796
+ 9ea77cb351 Remove unnecessary assignment to ast->body.line_count
797
+ 55a402bb75 Add line_count field to rb_ast_body_t
798
+ 2244c58b00 [Universal parser] Decouple IMEMO from rb_ast_t
799
+ 9b1e97b211 [Universal parser] DeVALUE of p->debug_lines and ast...
800
+ f5e387a300 Separate SCRIPT_LINES__ from ast.c
801
+ 8aa8fce320 Fix return-type warning in compile.c
802
+ ce544f8dbd [ruby/prism] [Compatibility] Improve printf format
803
+ 40ecad0ad7 [Universal parser] Fix -Wsuggest-attribute=format warnings
804
+ 9a19cfd4cd [Universal Parser] Reduce dependence on RArray in parse.y
805
+ b95e2cdca7 [ruby/prism] Additional fix of adding `x` prefix after ...
806
+ 54f27549e2 [ruby/prism] Chage some names
807
+ c4bd6da298 [ruby/prism] Make alloc interface replaceable
808
+ f0a46c6334 [ruby/prism] Include unistd.h before cheching _POSIX_...
809
+ 15b53e901c [ruby/prism] Use `_POSIX_MAPPED_FILES` and `_WIN32` to ...
810
+ ```
811
+
812
+ # Amount of diff in ruby/ruby by me
813
+
814
+ ```bash
815
+ $ git log --author=hasumikin \
816
+ --since=2024-01-01 --until=2024-4-30 \
817
+ --pretty=tformat: --numstat \
818
+   | awk '{add += $1; del += $2}
819
+          END {print "++:", add, "\n--:", del}'
820
+
821
+ ++: 1321
822
+ --: 818
823
+ ```
824
+
825
+ {:.center}
826
+ Despite 2000+ lines of changes,
827
+ CRuby's behavior has not changed.
828
+
829
+ # Let's try *`valgrind --tool=massif`* again🤞
830
+
831
+ ```c
832
+ mrc_irep *
833
+ compile_ruby_to_mruby_vm_code_with_lramagen(mrc_ccontext *c)
834
+ {
835
+   // ruby_init(); 👈 Remove this
836
+   rb_parser_t *parser = rb_ruby_parser_new();
837
+   rb_ast_t *ast = rb_ruby_parser_compile_string(
838
+     parser, "", "puts 'Hello, World!'", 0);
839
+   mrc_irep *irep = mrc_load_exec(c, (mrc_node *)ast->body.root);
840
+   // ruby_finalize(); 👈 Remove this
841
+   return irep;
842
+ }
843
+ ```
844
+
845
+ {:.center}
846
+ {::tag name="xx-small"}(For illustrative purposes only and non-functional){:/tag}
847
+
848
+ {:.center}
849
+ Now we can remove
850
+ *`ruby_init()`* and *`ruby_finalize()`*
851
+
852
+ # RAM consumption of Lrama-gen parser
853
+
854
+ ```bash
855
+ --------------------------------------------------------------------------------
856
+ Command: bin/mrbc_lrama ../mruby-compiler2/fixtures/hello_world.rb
857
+ Massif arguments: --stacks=yes
858
+ ms_print arguments: massif.out.26697
859
+ --------------------------------------------------------------------------------
860
+ (graph: peak 39.91 KB; x-axis 0 to 747.9 ki instructions)
883
+ ```
884
+
885
+ # RAM consumption of Lrama-gen parser
886
+
887
+ |Compiler|RAM consumption|
888
+ |---------|------------------|
889
+ |mrbc (original)|129.5 KB|
890
+ |picorbc|16.70 KB|
891
+ |mrbc_prism|9.219 KB|
892
+ |mrbc_lrama|9.985 MB 👉 *39.91 KB*🎉|
893
+
894
+ {:.center}
895
+ 　
896
+ {::tag name="x-large"}*99.6% reduction!!!*{:/tag}
897
+ (Ready to fight with Prism)
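The size of the reduction follows directly from the two mrbc_lrama figures in the table. The following is a quick check in Ruby, assuming that the MB reported by `ms_print` means 1024 KB:

```ruby
before_kb = 9.985 * 1024   # mrbc_lrama with ruby_init(), MB taken as 1024 KB
after_kb  = 39.91          # mrbc_lrama after DeGC

reduction = (1 - after_kb / before_kb) * 100
puts format("%.1f%% reduction", reduction)
```

Taking 1 MB as 1000 KB instead changes the result only in the second decimal place.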
898
+
899
+ # Finally, DeGCed!!!
900
+
901
+ ![](images/dependencies-1.png){:width="1700"}
902
+
903
+ # Remaining issues of Lrama-gen parser
904
+
905
+ - Make NODE(AST) design compatible with Prism
906
+ - Fine-tuning of the RAM consumption
907
+   - 39.91 KB is the result of *deGC* alone
908
+   - There is still more that can be done💪
909
+ - *ROM size* and *cross compilation*
910
+
911
+ # ROM size also matters
912
+
913
+ ```bash
914
+ $ ls -lSr picoruby/bin/
915
+ -rwxr-xr-x 851648 picorbc
916
+ -rwxr-xr-x 2512528 mrbc_prism
917
+ -rwxr-xr-x 3386824 mrbc # original
918
+ -rwxr-xr-x 25080800 mrbc_lrama # 24 MB😵
919
+ ```
920
+
921
+ - picorbc consists solely of the mruby compiler {::tag name="xx-small"}(ascii only){:/tag}
922
+ - mrbc_prism contains all encodings {::tag name="x-small"}(can be omitted){:/tag}
923
+ - mrbc contains mruby-core
924
+ - mrbc_lrama contains CRuby (miniruby)
925
+
926
+ # mrbc_lrama: 24 MB (on x86-64)
927
+
928
+ ![](images/dependencies-2.png){:width="1700"}
929
+
930
+ # To fix, we need cross compilation
931
+
932
+ ```bash
933
+ $ cd ruby
934
+ $ ./autogen.sh
935
+ $ ./configure
936
+ #=> .ext/include/x86_64-linux/ruby/config.h
937
+
938
+ $ ./configure --host=arm-linux-eabi
939
+ #=> .ext/include/arm-linux-eabi/ruby/config.h
940
+ ```
941
+
942
+ {:.center}
943
+ 　
944
+ {::tag name="x-large"}*What is `ruby/config.h`?*{:/tag}
945
+
946
+ # *`ruby/config.h`* is platform-specific config
947
+
948
+ ```c
949
+ // .ext/include/arm-linux-eabi/ruby/config.h
950
+ ...
951
+ #define RUBY_ABI_VERSION 0
952
+ #define HAVE_STDIO_H 1
953
+ ...
954
+ #define SIZEOF_INT 4
955
+ #define SIZEOF_SHORT 2
956
+ ...
957
+ #define SIZEOF_VOIDP 4 // 👈 32 bit
958
+ #define SIZEOF_FLOAT 4
959
+ #define SIZEOF_DOUBLE 8
960
+ ...
961
+ ```
962
+
963
+ # Target triplets {::tag name="x-small"}[CPU]-[ambiguous]-[ambiguous]{:/tag}
964
+
965
+ - x86_64-linux-gnu:
966
+   - x86 64-bit Linux with GNU libc
967
+ - aarch64-linux-gnu:
968
+   - Arm 64-bit Linux with GNU libc
969
+ - arm-linux-gnueabi(hf):
970
+   - Arm 32-bit Linux with GNU libc for EABI (hard-float)
971
+     EABI: Embedded Application Binary Interface
972
+ - arm-none-eabi:
973
+   - Arm 32-bit *bare-metal* with newlib
974
+
975
+ # arm-none-eabi
976
+
977
+ - RP2040 (Raspberry Pi Pico)
978
+   - 133 MHz Arm Cortex-M0+ (dual)
979
+
980
+ ![](images/rpi_pico.jpg){:align="right" height="800"}
981
+
982
+ # Configurable with arm-none-eabi? *No*
983
+
984
+ ```bash
985
+ $ cd ruby
986
+ $ ./autogen.sh #=> configure
987
+ $ ./configure --host=arm-none-eabi
988
+ #=> error😢😢😢😢😢😢
989
+ ```
990
+
991
+ {:.center}
992
+ 　
993
+ {::tag name="x-large"}*Why?*{:/tag}
994
+ Because "none-eabi" doesn't support
995
+ things like *`stdio.h`* that CRuby requires
996
+
997
+ # The next step is to make an effective build
998
+
999
+ - Configurable against bare-metal
1000
+   - Arm, ESP32, PIC32, etc.
1001
+ - Compile only what we need instead of making libruby-static.a
1002
+   - `parse.o` + `node.o` + `parser_st.o` + *`?.o`*
1003
+ - Resolving the missing dependencies is another challenge
1004
+
1005
+ # Still some deVALUE to do
1006
+
1007
+ ![](images/dependencies-3.png){:width="1700"}
1008
+
1009
+ # Functions to be deVALUEd
1010
+
1011
+ ```c
1012
+ VALUE rb_node_file_path_val(const NODE *node);
1013
+ VALUE rb_node_str_string_val(const NODE *node);
1014
+ VALUE rb_node_line_lineno_val(const NODE *node);
1015
+ VALUE rb_node_encoding_val(const NODE *node);
1016
+ VALUE rb_node_integer_literal_val(const NODE *n);
1017
+ VALUE rb_node_float_literal_val(const NODE *n);
1018
+ VALUE rb_node_rational_literal_val(const NODE *n);
1019
+ VALUE rb_node_imaginary_literal_val(const NODE *n);
1020
+ VALUE rb_node_regx_string_val(const NODE *node);
1021
+ VALUE rb_node_sym_string_val(const NODE *node);
1022
+ VALUE rb_str_new_parser_string(rb_parser_string_t *str);
1023
+ VALUE rb_str_new_mutable_parser_string(rb_parser_string_t *str);
1024
+ VALUE rb_sym2id(VALUE sym);
1025
+ ```
1026
+
1027
+ {:.center}
1028
+ Further deVALUE is needed
1029
+
1030
+ # Build with dummy functions (PoC)
1031
+
1032
+ ```bash
1033
+ # Linking libruby-static.a
1034
+ -rwxr-xr-x 2512528 mrbc_prism
1035
+ -rwxr-xr-x 25080800 mrbc_lrama 😵
1036
+ 👇
1037
+ # Linking dummy functions
1038
+ -rwxr-xr-x 2512528 mrbc_prism
1039
+ -rwxr-xr-x 1927440 mrbc_lrama 🥳
1040
+ ```
1041
+
1042
+ {:.center}
1043
+ Now mrbc_lrama is barely able to compile
1044
+ *`puts 'Hello, World!'`*
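The effect of linking dummy functions can be put into numbers from the two `mrbc_lrama` sizes listed above. This is a small Ruby sketch; the variable names are illustrative.

```ruby
linked_with_libruby = 25_080_800  # bytes, linking libruby-static.a
linked_with_dummies = 1_927_440   # bytes, linking dummy functions

# Convert bytes to MiB (1024 * 1024 bytes), rounded to two decimals.
mib = ->(bytes) { (bytes / (1024.0 * 1024)).round(2) }
reduction = ((1 - linked_with_dummies.to_f / linked_with_libruby) * 100).round(1)

puts "#{mib.(linked_with_libruby)} MiB -> #{mib.(linked_with_dummies)} MiB"
puts "#{reduction}% smaller"
```

The binary shrinks by roughly 92%, which brings `mrbc_lrama` below the size of `mrbc_prism` in this proof of concept.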
1045
+
1046
+ # hide-title
1047
+
1048
+ {:.center}
1049
+ {::tag name="large"}*Work in progress...*{:/tag}
1050
+ 　
1051
+ 　
1052
+ {::tag name="small"}`https://github.com/picoruby/mruby-compiler2`{:/tag}
1053
+ 　
1054
+ {::tag name="small"}`https://github.com/picoruby/mruby-bin-mrbc2`{:/tag}
1055
+
1056
+ ## prop
1057
+
1058
+ hide-title
1059
+ : true
1060
+
1061
+ # FYI, PicoRuby talk at #rubykaigiC 16:40
1062
+
1063
+ ![](images/s01.png){:width="1500" relative_margin_top="5"}
1064
+
1065
+ # Also, PicoRuby appears in Lightning Talks
1066
+
1067
+ ![](images/hachi.png){:width="1500" relative_margin_top="5"}
1068
+
1069
+ # This guy may give you a Raspi Pico
1070
+
1071
+ ![](images/yancya.png){:height="700" relative_margin_top="0"}
1072
+
1073
+ {:.center}
1074
+ {::tag name="xx-small"}https://twitter.com/hanachin_/status/1790713159906681259{:/tag}
1075
+ *Catch him at the venue*
1076
+
1077
+ # Lrama-gen univ-parser is almost there!!!
1078
+
1079
+ - Embed libruby-static.a in PicoRuby compiler
1080
+   - → RAM consumption: *10 MB* 😨
1081
+   - → ROM size: *24 MB* 😵
1082
+ - → DeVALUE, DeIMEMO, DeGC
1083
+   - → RAM consumption: *40 KB* 🥳
1084
+ - → Improve build process
1085
+   - → ROM size: *1.9 MB* 😊
1086
+
1087
+ # Wrap-up
1088
+
1089
+ - PicoRuby compiler will integrate
1090
+   both Prism and Lrama-gen parser
1091
+ - Remaining work
1092
+   - Cross compile
1093
+   - DeVALUE (a little more)
1094
+   - Symbol table (ID)
1095
+   - Fine-tuning
1096
+
1097
+ ![](images/QR_github-com-picoruby-picoruby.png){:
1098
+ align="right" width="600" relative_margin_top="3" relative_margin_left="0"
1099
+ draw0="[text, github.com/picoruby/picoruby, -0.14, 1.01, {color: black, size: 34, font_family: 'Courier Prime'\}]"
1100
+ }
1101
+
1102
+
1103
+