rabbit-slide-hasumikin-RubyKaigi2024 2024.05.16.0

@@ -0,0 +1,1103 @@
1
+ # Unlock The Universal Parsers
2
+
3
+ subtitle
4
+ : A New PicoRuby Compiler
5
+
6
+ author
7
+ : @hasumikin
8
+
9
+ institution
10
+ : Monstarlab
11
+
12
+ allotted-time
13
+ : 30m
14
+
15
+ theme
16
+ : rubykaigi-2024
17
+
18
+ # [AD] "n-monthly Lambda Note" Vol.4, No.1
19
+
20
+ 　
21
+ 　
22
+ **Learning How
23
+ Programming Languages
24
+ Control Electrical Circuits
25
+ with PicoRuby**
26
+
27
+ {::tag name="small"}(Japanese only){:/tag}
28
+
29
+ 1,760 yen {::tag name="x-small"}(tax included){:/tag}
30
+
31
+ ![](images/lambdanote.png){:
32
+ align="right"
33
+ width="550"
34
+ relative_margin_top="-2"
35
+ relative_margin_right="5"
36
+ }
37
+
38
+ # CRuby's parser
39
+
40
+ - parse.*y* and prism.*c*: human-writable
41
+ - parse.*c*: generated by Bison or Lrama
42
+ - prism.*c*: no generator needed
43
+
44
+ ```mermaid
45
+ graph LR;
46
+ parse_y(parse.y) -->|Bison| parse_c_1[parse.c #40;retired#41;]
47
+ parse_y(parse.y) -->|Lrama| parse_c_2[lrama-generated parse.c]
48
+ prism(prism.c)
49
+ ```
50
+ {:relative_height="100"}
51
+
52
+ # Prism and Lrama-generated parser
53
+
54
+ - Prism
55
+   - Newly developed universal Ruby parser
56
+   - Portable, with a small footprint
57
+   - Easy to integrate
58
+   - Available as a gem
59
+ - Lrama-generated parser (parse.c)
60
+   - 100% compatible with CRuby in every respect
61
+   - Not portable by nature
62
+
63
+ # Parsers in Ruby (as of May 2024)
64
+
65
+ |Ruby|Parser|
66
+ |---------|--------|
67
+ |CRuby (default)|Lrama-generated|
68
+ |CRuby (experimental)|Prism|
69
+ |JRuby|Prism|
70
+ |TruffleRuby|Prism|
71
+ |*mruby*|*by Matz*|
72
+ |*PicoRuby*|*by hasumikin (me)*|
73
+ |IRB(katakata_irb)|(gem) Prism|
74
+ |RuboCop|(gem) Parser → Prism|
75
+
76
+ # Prism in RuboCop
77
+
78
+ ![](images/koic.png){:width="1500" relative_margin_top="0"}
79
+
80
+ # Parsers in Ruby *(I'm working on)*
81
+
82
+ |Ruby|Parser|
83
+ |---------|--------|
84
+ |CRuby (default)|Lrama-generated|
85
+ |CRuby (optional)|Prism|
86
+ |JRuby|Prism|
87
+ |TruffleRuby|Prism|
88
+ |*mruby*|*???????*|
89
+ |*PicoRuby*|*???????*|
90
+ |IRB(katakata_irb)|(gem) Prism|
91
+ |RuboCop|(gem) Parser → Prism|
92
+
93
+ # mruby compiler generates VM code
94
+
95
+ ```bash
96
+ $ echo 'puts "Hello, World!"' | bin/mrbc - | xxd # hexdump
97
+ 00000000: 5249 5445 3033 3030 0000 005e 4d41 545a RITE0300...^MATZ
98
+ 00000010: 3030 3030 4952 4550 0000 0042 3033 3030 0000IREP...B0300
99
+ 00000020: 0000 0036 0000 0004 0000 0000 0000 000a ...6............
100
+ 00000030: 5102 002d 0100 0138 0169 0001 0000 0d48 Q..-...8.i.....H
101
+ 00000040: 656c 6c6f 2c20 576f 726c 6421 0000 0100 ello, World!....
102
+ 00000050: 0470 7574 7300 454e 4400 0000 0008 .puts.END.....
103
+
104
+ $ echo 'puts "Hello, World!"' | bin/mrbc - | bin/mruby
105
+ Hello, World!
106
+ ```
107
+
108
+ {:.center}
109
+ (PicoRuby compiler does the same)
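The first 16 bytes of that dump can be decoded by hand. The sketch below assumes the header is a 4-byte identifier, a 4-byte format version, a 4-byte big-endian total size, and a 4-byte compiler name. That layout and the field names are an interpretation of the dump shown above, not something the slides state.

```ruby
# First 16 bytes of the hexdump above, copied by hand.
header_hex = "5249 5445 3033 3030 0000 005e 4d41 545a"

# Assumed layout (illustrative field names):
#   ident (4) | format version (4) | total size, big-endian (4) | compiler (4)
def decode_rite_header(hex)
  bytes = [hex.delete(" ")].pack("H*")
  ident, version, size, compiler = bytes.unpack("a4 a4 N a4")
  { ident: ident, version: version, size: size, compiler: compiler }
end

header = decode_rite_header(header_hex)
puts header.inspect
```

Under this reading, the size field is 0x5e (94), which matches the length of the dump: five full 16-byte rows plus a final row of 14 bytes.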
110
+
111
+ # PicoRuby compiler must have a small footprint
112
+
113
+ - Raspberry Pi Pico
114
+   - 133 MHz Arm Cortex-M0+ (dual)
115
+   - *264 KB SRAM*
116
+   - 2 MB Flash ROM
117
+
118
+ ![](images/rpi_pico.jpg){:align="right" height="800"}
119
+
120
+ # PicoRuby compiler must have a small footprint
120
+
121
+ - Raspberry Pi Pico
122
+   - 133 MHz Arm Cortex-M0+ (dual)
123
+   - *264 KB SRAM*
124
+   - 2 MB Flash ROM
126
+
127
+ {:.center}
128
+ 　
129
+ {::tag name="x-large"}*If a universal parser is not small enough, it won't be "universal"*{:/tag}
130
+
131
+ ![](images/rpi_pico.jpg){:align="right" height="800"}
132
+
133
+ # PicoRuby compiler must have a small footprint
134
+
135
+ ```bash
136
+ $ valgrind \
137
+ --tool=massif \
138
+ --stacks=yes \
139
+ path/to/mrbc hello_world.rb
140
+ #=> massif.out.12345
141
+
142
+ $ ms_print massif.out.12345 | less
143
+ ```
144
+
145
+ # The original mruby Compiler
146
+
147
+ ```bash
148
+ --------------------------------------------------------------------------------
149
+ Command: bin/mrbc ../mruby-compiler2/fixtures/hello_world.rb
150
+ Massif arguments: --stacks=yes
151
+ ms_print arguments: massif.out.29290
152
+ --------------------------------------------------------------------------------
153
+ (ms_print graph omitted: peak 129.5 KB; x-axis 0 to 1.030 Mi)
176
+ ```
177
+
178
+ # Existing PicoRuby Compiler
179
+
180
+ ```bash
181
+ --------------------------------------------------------------------------------
182
+ Command: bin/picorbc ../mruby-compiler2/fixtures/hello_world.rb
183
+ Massif arguments: --stacks=yes
184
+ ms_print arguments: massif.out.28004
185
+ --------------------------------------------------------------------------------
186
+ (ms_print graph omitted: peak 16.70 KB; x-axis 0 to 219.6 ki)
209
+ ```
210
+
211
+ # mrbc_prism
212
+
213
+ ```bash
214
+ --------------------------------------------------------------------------------
215
+ Command: bin/mrbc_prism ../mruby-compiler2/fixtures/hello_world.rb
216
+ Massif arguments: --stacks=yes
217
+ ms_print arguments: massif.out.31071
218
+ --------------------------------------------------------------------------------
219
+ (ms_print graph omitted: peak 9.219 KB; x-axis 0 to 201.0 ki)
242
+ ```
243
+
244
+ # RAM consumption
245
+
246
+ |Compiler|RAM consumption|
247
+ |---------|------------------|
248
+ |mrbc (original)|129.5 KB|
249
+ |picorbc|16.70 KB|
250
+ |mrbc_prism|9.219 KB*\**|
251
+
252
+ {:.right}
253
+ {::tag name="xx-small"}Figures were measured on 64-bit Linux{:/tag}
254
+ 　
255
+ {:.left}
256
+ {::tag name="xx-small"}*\** mrbc_prism's small footprint is partly due to a new code generator and may grow as development continues{:/tag}
257
+
258
+ # mrbc_prism
259
+
260
+ ```c
261
+ mrc_irep *
262
+ compile_ruby_to_mruby_vm_code_with_prism(mrc_ccontext *c)
263
+ {
264
+   pm_parser_t parser;
265
+   const uint8_t *string = (const uint8_t *)"puts 'Hello, World!'";
266
+   pm_parser_init(&parser,
267
+                  string, strlen((const char *)string),
268
+                  NULL);
269
+   pm_node_t *root = pm_parse(&parser);
270
+   mrc_irep *irep = mrc_load_exec(c, (mrc_node *)root);
271
+   pm_node_destroy(&parser, root);
272
+   pm_parser_free(&parser);
273
+   return irep;
274
+ }
275
+ ```
276
+
277
+ {:.center}
278
+ {::tag name="xx-small"}(For illustrative purposes only and non-functional){:/tag}
279
+
280
+ # Prism, really good stuff
281
+
282
+ - Well designed APIs and NODE
283
+ - Portable
284
+ - Small footprint
285
+
286
+ {:.center}
287
+ 　
288
+ {::tag name="x-large"}*Still, I'm going to implement Lrama-generated parser.*{:/tag}
289
+ {::tag name="x-large"}*Why?*{:/tag}
290
+
291
+ # In Taipei, Taiwan last year...
292
+
293
+ ![](images/taiwan.jpg){:width="1200"
294
+ draw0="[text, ＠spikeolaf, 0.34, 0.4, {color: white, size: 40, font_family: 'Poppins', weight: bold\}]"
295
+ draw1="[text, ＠koic, 0.82, 0.3, {color: white, size: 40, font_family: 'Poppins', weight: bold\}]"
296
+ draw2="[text, me, 0.05, 0.23, {color: white, size: 40, font_family: 'Poppins', weight: bold\}]"
297
+ }
298
+
299
+ # A Labyrinth
300
+
301
+ ![](images/labyrinth.jpg){:width="1980" height="1250" relative_margin_right="-10" relative_margin_top="-6" align="right"
302
+ draw0="[text, parse.y is a labyrinth, 0.11, 0.7, {color: green, size: 100, font_family: 'Dela Gothic One'\}]"
303
+ }
304
+
305
+ # @ydah_
306
+
307
+ ![](images/ydah_.png){:width="1500" }
308
+
309
+ # @junk0612
310
+
311
+ ![](images/junk0612.png){:width="1500" }
312
+
313
+ # no longer a Labyrinth
314
+
315
+ ![](images/nolongerlabyrinth.jpg){:width="1980" height="1250" relative_margin_right="-10" relative_margin_top="-6" align="right"
316
+ draw0="[text, No longer a labyrinth, 0.11, 0.7, {color: cyan, size: 100, font_family: 'Dela Gothic One'\}]"
317
+ }
318
+
319
+ # Why Lrama-generated parser?
320
+
321
+ - Making parse.y readable and maintainable
322
+ - Integrated error tolerance
323
+ - Innovative state management of the tokenizer
324
+ - In short, a new era for LALR parser generators
325
+
326
+ {:.center}
327
+ 　
328
+ {::tag name="x-large"}*Computer-science-wide achievement*{:/tag}
329
+
330
+ # hide-title
331
+
332
+ {:.center}
333
+ {::tag name="xx-large"}OK, let me use both{:/tag}
334
+ {::tag name="xx-large"}*Prism* and *Lrama-gen parser*{:/tag}
335
+ {::tag name="xx-large"}in my PicoRuby compiler💪{:/tag}
336
+
337
+ ## prop
338
+
339
+ hide-title
340
+ : true
341
+
342
+ # Parsers in Ruby *(I'm working on)*
343
+
344
+ |Ruby|Parser|
345
+ |---------|--------|
346
+ |CRuby (default)|Lrama-generated|
347
+ |CRuby (optional)|Prism|
348
+ |JRuby|Prism|
349
+ |TruffleRuby|Prism|
350
+ |*mruby*|*Prism and Lrama-gen*|
351
+ |*PicoRuby*|*Prism and Lrama-gen*|
352
+ |IRB(katakata_irb)|(gem) Prism|
353
+ |RuboCop|(gem) Parser → Prism|
354
+
355
+ # AST of Lrama-generated parser
356
+
357
+ ```bash
358
+ $ ruby --dump=p -e'puts "Hello, World!"'
359
+ ###########################################################
360
+ ## Do NOT use this node dump for any purpose other than ##
361
+ ## debug and research. Compatibility is not guaranteed. ##
362
+ ###########################################################
363
+
364
+ # @ NODE_SCOPE (id: 3, line: 1, location: (1,0)-(1,20))
365
+ # +- nd_tbl: (empty)
366
+ # +- nd_args:
367
+ # | (null node)
368
+ # +- nd_body:
369
+ # @ NODE_FCALL (id: 0, line: 1, location: (1,0)-(1,20))*
370
+ # +- nd_mid: :puts
371
+ # +- nd_args:
372
+ # @ NODE_LIST (id: 2, line: 1, location: (1,5)-(1,20))
373
+ # +- as.nd_alen: 1
374
+ # +- nd_head:
375
+ # | @ NODE_STR (id: 1, line: 1, location: (1,5)-(1,20))
376
+ # | +- nd_lit: "Hello, World!"
377
+ # +- nd_next:
378
+ # (null node)
379
+ ```
380
+
381
+ # hide-title
382
+
383
+ {::tag name="xx-large"}Version 0.0.0.0.1 of mrbc_lrama{:/tag}
384
+ {::tag name="xx-small"}(A mruby compiler using Lrama-generated parser){:/tag}
385
+
386
+ ## prop
387
+
388
+ hide-title
389
+ : true
390
+
391
+ # How to integrate Lrama-gen parser
392
+
393
+ ```bash
394
+ $ cd ruby/ruby
395
+ $ ./autogen.sh #=> configure
396
+ $ ./configure #=> common.mk etc.
397
+ $ make
398
+ $ ls -l *.a
399
+ -rw-r--r-- 1 145797150 libruby-static.a
400
+ ```
401
+
402
+ {:.center}
403
+ {::tag name="x-large"}An archive file containing all the object files of the Ruby interpreter{:/tag}
404
+
405
+ # libruby-static.a
406
+
407
+ ```mermaid
408
+ flowchart LR
409
+ parse.y -->|Lrama| parse.c --> parse.o
410
+ node.c --> node.o
411
+ string.c --> string.o
412
+ subgraph libruby-static.a
413
+ parse.o
414
+ node.o
415
+ string.o
416
+ etc.
417
+ end
418
+ libHOGE.so --> ruby
419
+ libruby-static.a ---> ruby
420
+ ```
421
+ {:relative_height="100"}
422
+
423
+ # Even CRuby can be embedded
424
+
425
+ ```c
426
+ // hello_world.c
427
+ #include <ruby.h>
428
+ int
429
+ main(int argc, char **argv)
430
+ {
431
+ ruby_init();
432
+ rb_eval_string("puts 'Hello, World!'");
433
+ ruby_finalize();
434
+ return 0;
435
+ }
436
+ ```
437
+
438
+ {:.center}
439
+ {::tag name="xx-small"}(For illustrative purposes only and non-functional){:/tag}
440
+
441
+ # Even CRuby can be embedded
442
+
443
+ ```bash
444
+ $ cc -o hello_world \
445
+ hello_world.c \
446
+ -I/path/to/ruby/include \
447
+ -L/path/to/ruby \
448
+ -l:libruby-static.a
449
+ # ^^^^^^^^^^^^^^^^
450
+
451
+ $ ./hello_world
452
+ #=> Hello, World!
453
+ ```
454
+
455
+ # Legendary work of embedded CRuby
456
+
457
+ ![](images/mod_ruby.png){:width="1660"}
458
+
459
+ # hide-title
460
+
461
+ {::tag name="xx-large"}Let's see the memory consumption of mrbc_lrama🧐{:/tag}
462
+
463
+ ## prop
464
+
465
+ hide-title
466
+ : true
467
+
468
+ # Memory consumption of mrbc_lrama
469
+
470
+ ```bash
471
+ --------------------------------------------------------------------------------
472
+ Command: bin/mrbc_lrama ../mruby-compiler2/fixtures/hello_world.rb
473
+ Massif arguments: --stacks=yes
474
+ ms_print arguments: massif.out.14624
475
+ --------------------------------------------------------------------------------
476
+ (ms_print graph omitted: peak 9.985 MB; x-axis 0 to 43.25 Mi)
499
+ ```
500
+
501
+ # hide-title
502
+
503
+ {:.center}
504
+ {::tag name="xx-large"}9.985 MB😨{:/tag}
505
+
506
+ ## prop
507
+
508
+ hide-title
509
+ : true
510
+
511
+ # RAM consumption
512
+
513
+ |Compiler|RAM consumption|
514
+ |---------|------------------|
515
+ |mrbc (original)|129.5 KB|
516
+ |picorbc|16.70 KB|
517
+ |mrbc_prism|9.219 KB|
518
+ |mrbc_lrama|9.985 *MB*|
519
+
520
+ # Why does it consume so much RAM?
521
+
522
+ ```c
523
+ mrc_irep *
524
+ compile_ruby_to_mruby_vm_code_with_lramagen(mrc_ccontext *c)
525
+ {
526
+ ruby_init(); // 👈👈👈👈👈👈👈👈
527
+ rb_parser_t *parser = rb_ruby_parser_new();
528
+ rb_ast_t *ast = rb_ruby_parser_compile_string(
529
+ parser, "", "puts 'Hello, World!'", 0);
530
+ mrc_irep *irep = mrc_load_exec(c, (mrc_node *)ast->body.root);
531
+ ruby_finalize();
532
+ return irep;
533
+ }
534
+ ```
535
+
536
+ {:.center}
537
+ {::tag name="xx-small"}(For illustrative purposes only and non-functional){:/tag}
538
+
539
+ {:.center}
540
+ *`ruby_init()`* sets up CRuby's GC.
541
+ Remove it, however, and you get a *SEGV*
542
+
543
+ # Dependencies (actual)
544
+
545
+ ![](images/dependencies-0.png){:width="1700"}
546
+
547
+ # Dependencies (goal)
548
+
549
+ ![](images/dependencies-0.png){:width="1700"
550
+ draw0="[text, 🙅🏻, 0.36, 0.24, {size: 70, font_family: 'Noto Color Emoji'\}]"
551
+ draw1="[text, 🙅🏻, 0.43, 0.49, {size: 70, font_family: 'Noto Color Emoji'\}]"
552
+ }
553
+
554
+ # *DeGC* is what we need
555
+
556
+ - DeVALUE
557
+   - *Decouple VALUE*, the C-level representation of a Ruby object such as RString or RArray
558
+ - DeIMEMO
559
+   - *Decouple IMEMO*, a C-level object managed by the GC
560
+ - DeGC
561
+   - Ultimately, we *decouple the GC* from the parser
562
+
563
+ # Relevant work of DeVALUE
564
+
565
+ ![](images/sh.png){:width="1500" relative_margin_top="5"}
566
+
567
+ {:.center}
568
+ Lightning Talks today
569
+
570
+ # rb_parser_ary_t* instead of RArray
571
+
572
+ ```c
573
+ enum rb_parser_ary_data_type {
574
+ PARSER_ARY_DATA_AST_TOKEN,
575
+ PARSER_ARY_DATA_SCRIPT_LINE
576
+ };
577
+
578
+ typedef struct rb_parser_ary {
579
+ enum rb_parser_ary_data_type data_type;
580
+ rb_parser_ary_data *data; // typedef of void*
581
+ long len; // current size
582
+ long capa; // capacity
583
+ } rb_parser_ary_t;
584
+ ```
585
+
586
+ # e.g. DeVALUE of tokens
587
+
588
+ ```ruby
589
+ node = RubyVM::AbstractSyntaxTree.
590
+   parse("puts 'Hello, World!'",
591
+         keep_tokens: true) # 👈
592
+
593
+ node.tokens
594
+ #=>
595
+ [[0, :tIDENTIFIER, "puts", [1, 0, 1, 4]],
596
+ [1, :tSP, " ", [1, 4, 1, 5]],
597
+ [2, :tSTRING_BEG, "'", [1, 5, 1, 6]],
598
+ [3, :tSTRING_CONTENT, "Hello, World!", [1, 6, 1, 19]],
599
+ [4, :tSTRING_END, "'", [1, 19, 1, 20]]]
600
+ ```
601
+
602
+ # e.g. DeVALUE of tokens
603
+
604
+ ```c
605
+ // parse.y
606
+ static void parser_append_tokens(struct parser_params *p,...
607
+ {
608
+ VALUE ary = rb_ary_new2(4); // A token
609
+ (...)
610
+ rb_ary_push(p->tokens, ary);
611
+ (...)
612
+ }
613
+
614
+ (...)
615
+
616
+ static VALUE yycompile0(VALUE arg)
617
+ {
618
+ (...)
619
+ if (p->keep_tokens) {
620
+ p->ast->node_buffer->tokens = tokens;
621
+ p->tokens = NULL;
622
+ }
623
+ (...)
624
+ }
625
+ ```
626
+
627
+ # e.g. DeVALUE of tokens
628
+
629
+ ```c
630
+ struct parser_params {
631
+ (...)
632
+ VALUE tokens;
633
+ }
634
+ ```
635
+
636
+ {:.center}
637
+ 👇
638
+
639
+ ```c
640
+ struct parser_params {
641
+ (...)
642
+ rb_parser_ary_t *tokens;
643
+ }
644
+ ```
645
+
646
+ # e.g. DeVALUE of script_lines
647
+
648
+ - `ast->body.script_lines`
649
+ - `iseq->variable.script_lines`
650
+ - `SCRIPT_LINES__`
651
+
652
+ {:.center}
653
+ 　
654
+ 　
655
+ {::tag name="x-large"}*Three kinds of script_lines*{:/tag}
656
+
657
+ # script_lines in AST
658
+
659
+ ```ruby
660
+ node = RubyVM::AbstractSyntaxTree.
661
+ parse("puts 'Hello, World!'",
662
+ keep_script_lines: true) # 👈
663
+
664
+ node.script_lines
665
+ #=>
666
+ ["puts 'Hello, World!'"]
667
+ ```
668
+
669
+ # script_lines in ISEQ
670
+
671
+ ```ruby
672
+ iseq = RubyVM::InstructionSequence.
673
+ compile("puts 'Hello, World!'",
674
+ keep_script_lines: true) # 👈
675
+
676
+ iseq.script_lines
677
+ #=>
678
+ ["puts 'Hello, World!'"]
679
+ ```
680
+
681
+ # `SCRIPT_LINES__` at the top level
682
+
683
+ ```ruby
684
+ # hello_world.rb
685
+ puts 'Hello, World!'
686
+ ```
687
+
688
+ ```ruby
689
+ SCRIPT_LINES__ = {}
690
+ require './hello_world'
691
+ p SCRIPT_LINES__
692
+ #=>
693
+ {"./hello_world"=>["puts 'Hello, World!'"]}
694
+ ```
695
+
696
+ {:.center}
697
+ debug.rb and tracer.rb used to use `SCRIPT_LINES__`
698
+
699
+ # Internally, they were the same RArray
700
+
701
+ - `ast->body.script_lines`
702
+ - `iseq->variable.script_lines`
703
+ - `SCRIPT_LINES__[filename]`
704
+
705
+ {:.center}
706
+ 　
707
+ Build script_lines as a *`rb_parser_ary_t`*
708
+ during parsing, then convert it to a *VALUE*
709
+ only when needed, e.g. when *`Node#script_lines`* is called
710
+
711
+ # IMEMO (Internal MEMOry Object)
712
+
713
+ - IMEMO is a C-level object managed by the GC
714
+ - Not a VALUE
715
+ - Limited to five words so that it can be GC-ed
716
+   - The first word has to be the flags
717
+   - Only five words can be used in total
718
+
719
+ # e.g. DeIMEMO of AST (before)
720
+
721
+ ```c
722
+ typedef struct rb_ast_struct {
723
+ VALUE flags; // To be removed
724
+ node_buffer_t *node_buffer;
725
+ struct {
726
+ const NODE *root;
727
+ VALUE *script_lines; // To be deVALUEd
728
+ signed int frozen_string_literal:2;
729
+ signed int coverage_enabled:2;
730
+ } body;
731
+ } rb_ast_t;
732
+ ```
733
+
734
+ {:.center}
735
+ {::tag name="x-large"}*Five words*{:/tag}
736
+
737
+ # e.g. DeIMEMO of AST (after)
738
+
739
+ ```c
740
+ typedef struct rb_ast_struct {
741
+ node_buffer_t *node_buffer;
742
+ struct {
743
+ const NODE *root;
744
+ rb_parser_ary_t *script_lines;
745
+ int line_count; // Separated from script_lines
746
+ signed int frozen_string_literal:2;
747
+ signed int coverage_enabled:2;
748
+ } body;
749
+ #ifdef UNIVERSAL_PARSER
750
+ const rb_parser_config_t *config; // Moved from node_buffer_t
751
+ #endif
752
+ } rb_ast_t;
753
+ ```
754
+
755
+ {:.center}
756
+ {::tag name="x-large"}*🤔...Still five words?*{:/tag}
757
+
758
+ # e.g. DeIMEMO of AST (after)
759
+
760
+ ```c
761
+ typedef struct rb_ast_struct {
762
+ node_buffer_t *node_buffer;
763
+ struct {
764
+ const NODE *root;
765
+ rb_parser_ary_t *script_lines;
766
+ int line_count; // One word in 32 bit
767
+ signed int frozen_string_literal:2;
768
+ signed int coverage_enabled:2;
769
+ } body;
770
+ #ifdef UNIVERSAL_PARSER
771
+ const rb_parser_config_t *config; // Moved from node_buffer_t
772
+ #endif
773
+ } rb_ast_t;
774
+ ```
775
+
776
+ {:.center}
777
+ {::tag name="x-large"}*Six words in 32 bit💡*{:/tag}
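The word counts on these slides can be reproduced with a small layout model. This is a sketch, not CRuby code. It assumes a typical ABI where pointers and `VALUE` are 8 bytes on 64-bit and 4 bytes on 32-bit targets, `int` is 4 bytes, the two 2-bit bitfields share one `int`, and the `UNIVERSAL_PARSER` branch is compiled in. The helper name `struct_words` is made up for illustration.

```ruby
# Lay out fields in order, padding each to its own alignment
# (alignment == size for these scalar types), and return the
# struct size in machine words. The nested `body` struct is
# flattened, which does not change the result for these layouts.
def struct_words(fields, ptr_size)
  sizes = fields.map { |kind| kind == :ptr ? ptr_size : 4 }
  offset = 0
  sizes.each do |size|
    offset += (-offset) % size    # padding before the field
    offset += size
  end
  offset += (-offset) % sizes.max # tail padding
  offset / ptr_size
end

# rb_ast_t before: flags, node_buffer, root, script_lines, bitfields
before = [:ptr, :ptr, :ptr, :ptr, :int]
# rb_ast_t after: node_buffer, root, script_lines, line_count, bitfields, config
after  = [:ptr, :ptr, :ptr, :int, :int, :ptr]

[8, 4].each do |ptr_size|
  puts "#{ptr_size * 8}-bit: before=#{struct_words(before, ptr_size)} words, " \
       "after=#{struct_words(after, ptr_size)} words"
end
```

Under these assumptions the model gives five words before and after on 64-bit, and six words for the new layout on 32-bit, because `line_count` no longer shares a padded word with the bitfields.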
778
+
779
+ # DeGC
780
+
781
+ {:.center}
782
+ 　
783
+ 　
784
+ 　
785
+ {::tag name="x-large"}DeVALUE and DeIMEMO{:/tag}
786
+ {::tag name="x-large"}automatically complete *DeGC*😊{:/tag}
787
+
788
+ # Amount of diff in ruby/ruby by me
789
+
790
+ ```bash
791
+ $ git log --author=hasumikin \
792
+ --since=2024-01-01 --until=2024-4-30 \
793
+ --oneline
794
+
795
+ ddd8da4b6b [Universal parser] Improve AST structure
796
+ 9ea77cb351 Remove unnecessary assignment to ast->body.line_count
797
+ 55a402bb75 Add line_count field to rb_ast_body_t
798
+ 2244c58b00 [Universal parser] Decouple IMEMO from rb_ast_t
799
+ 9b1e97b211 [Universal parser] DeVALUE of p->debug_lines and ast...
800
+ f5e387a300 Separate SCRIPT_LINES__ from ast.c
801
+ 8aa8fce320 Fix return-type warning in compile.c
802
+ ce544f8dbd [ruby/prism] [Compatibility] Improve printf format
803
+ 40ecad0ad7 [Universal parser] Fix -Wsuggest-attribute=format warnings
804
+ 9a19cfd4cd [Universal Parser] Reduce dependence on RArray in parse.y
805
+ b95e2cdca7 [ruby/prism] Additional fix of adding `x` prefix after ...
806
+ 54f27549e2 [ruby/prism] Chage some names
807
+ c4bd6da298 [ruby/prism] Make alloc interface replaceable
808
+ f0a46c6334 [ruby/prism] Include unistd.h before cheching _POSIX_...
809
+ 15b53e901c [ruby/prism] Use `_POSIX_MAPPED_FILES` and `_WIN32` to ...
810
+ ```
811
+
812
+ # Amount of diff in ruby/ruby by me
813
+
814
+ ```bash
815
+ $ git log --author=hasumikin \
816
+ --since=2024-01-01 --until=2024-4-30 \
817
+ --pretty=tformat: --numstat \
818
+ | awk '{add += $1; del += $2} END {
819
+ print "++:", add, "\n--:", del}'
820
+
821
+ ++: 1321
822
+ --: 818
823
+ ```
824
+
825
+ {:.center}
826
+ Despite 2000+ lines of changes,
827
+ CRuby's behavior is unchanged.
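The same tally can be done in Ruby. This sketch sums the added and deleted columns of `git log --numstat` output. The sample lines are made up for illustration, and the method name `sum_numstat` is hypothetical.

```ruby
# Sum the "added" and "deleted" columns of `git log --numstat` output.
# Binary files are reported as "-\t-\tpath" and are skipped,
# as are the blank lines between commits.
def sum_numstat(text)
  added = deleted = 0
  text.each_line do |line|
    add, del, _path = line.chomp.split("\t", 3)
    next unless add.to_s.match?(/\A\d+\z/) && del.to_s.match?(/\A\d+\z/)
    added += add.to_i
    deleted += del.to_i
  end
  [added, deleted]
end

# Made-up sample input, not real data from ruby/ruby.
sample = "10\t2\tparse.y\n-\t-\timages/logo.png\n\n5\t7\tnode.c\n"
added, deleted = sum_numstat(sample)
puts "++: #{added}"
puts "--: #{deleted}"
```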
828
+
829
+ # Let's try *`valgrind --tool=massif`* again🀞
830
+
831
+ ```c
832
+ mrc_irep *
833
+ compile_ruby_to_mruby_vm_code_with_lramagen(mrc_ccontext *c)
834
+ {
835
+ // ruby_init(); 👈 Remove this
836
+ rb_parser_t *parser = rb_ruby_parser_new();
837
+ rb_ast_t *ast = rb_ruby_parser_compile_string(
838
+ parser, "", "puts 'Hello, World!'", 0);
839
+ mrc_irep *irep = mrc_load_exec(c, (mrc_node *)ast->body.root);
840
+ // ruby_finalize(); 👈 Remove this
841
+ return irep;
842
+ }
843
+ ```
844
+
845
+ {:.center}
846
+ {::tag name="xx-small"}(For illustrative purposes only and non-functional){:/tag}
847
+
848
+ {:.center}
849
+ Now we can remove
850
+ *`ruby_init()`* and *`ruby_finalize()`*
851
+
852
+ # RAM consumption of Lrama-gen parser
853
+
854
+ ```bash
855
+ --------------------------------------------------------------------------------
856
+ Command: bin/mrbc_lrama ../mruby-compiler2/fixtures/hello_world.rb
857
+ Massif arguments: --stacks=yes
858
+ ms_print arguments: massif.out.26697
859
+ --------------------------------------------------------------------------------
860
+ (ms_print graph omitted: peak 39.91 KB; x-axis 0 to 747.9 ki)
883
+ ```
884
+
885
+ # RAM consumption of Lrama-gen parser
886
+
887
+ |Compiler|RAM consumption|
888
+ |---------|------------------|
889
+ |mrbc (original)|129.5 KB|
890
+ |picorbc|16.70 KB|
891
+ |mrbc_prism|9.219 KB|
892
+ |mrbc_lrama|9.985 MB 👉 *39.91 KB*🎉|
893
+
894
+ {:.center}
895
+ 　
896
+ {::tag name="x-large"}*99.6% reduction!!!*{:/tag}
897
+ (Ready to fight with Prism)
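The size of that drop follows directly from the two peaks in the table. A quick check, assuming massif's MB means 1024 KB:

```ruby
before_kb = 9.985 * 1024  # peak with ruby_init(), converted to KB
after_kb  = 39.91         # peak after DeGC, in KB

# Fraction of the original peak that was eliminated, as a percentage.
reduction = (1 - after_kb / before_kb) * 100
puts format("reduction: %.1f%%", reduction)
```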
898
+
899
+ # Finally, DeGCed!!!
900
+
901
+ ![](images/dependencies-1.png){:width="1700"}
902
+
903
+ # Remaining issues of Lrama-gen parser
904
+
905
+ - Make the NODE (AST) design compatible with Prism
906
+ - Fine-tune RAM consumption
907
+   - 39.91 KB is the result of *DeGC* alone
908
+   - There is still more that can be done💪
909
+ - *ROM size* and *cross compilation*
910
+
911
+ # ROM size also matters
912
+
913
+ ```bash
914
+ $ ls -lSr picoruby/bin/
915
+ -rwxr-xr-x 851648 picorbc
916
+ -rwxr-xr-x 2512528 mrbc_prism
917
+ -rwxr-xr-x 3386824 mrbc # original
918
+ -rwxr-xr-x 25080800 mrbc_lrama # 24 MB😡
919
+ ```
920
+
921
+ - picorbc solely consists of mruby compiler {::tag name="xx-small"}(ascii only){:/tag}
922
+ - mrbc_prism contains all encodings {::tag name="x-small"}(can be omitted){:/tag}
923
+ - mrbc contains mruby-core
924
+ - mrbc_lrama contains CRuby (miniruby)
925
+
926
+ # mrbc_lrama: 24 MB (on x86-64)
927
+
928
+ ![](images/dependencies-2.png){:width="1700"}
929
+
930
+ # To fix, we need cross compilation
931
+
932
+ ```bash
933
+ $ cd ruby
934
+ $ ./autogen.sh
935
+ $ ./configure
936
+ #=> .ext/include/x86_64-linux/ruby/config.h
937
+
938
+ $ ./configure --host=arm-linux-eabi
939
+ #=> .ext/include/arm-linux-eabi/ruby/config.h
940
+ ```
941
+
942
+ {:.center}
943
+ 　
944
+ {::tag name="x-large"}*What is `ruby/config.h`?*{:/tag}
945
+
946
+ # *`ruby/config.h`* is platform-specific config
947
+
948
+ ```c
949
+ // .ext/include/arm-linux-eabi/ruby/config.h
950
+ ...
951
+ #define RUBY_ABI_VERSION 0
952
+ #define HAVE_STDIO_H 1
953
+ ...
954
+ #define SIZEOF_INT 4
955
+ #define SIZEOF_SHORT 2
956
+ ...
957
+ #define SIZEOF_VOIDP 4 // 👈 32 bit
958
+ #define SIZEOF_FLOAT 4
959
+ #define SIZEOF_DOUBLE 8
960
+ ...
961
+ ```
962
+
963
+ # Target triplets {::tag name="x-small"}[CPU]-[ambiguous]-[ambiguous]{:/tag}
964
+
965
+ - x86_64-linux-gnu:
966
+   - x86 64 bit Linux with GNU libc
967
+ - aarch64-linux-gnu:
968
+   - Arm 64 bit Linux with GNU libc
969
+ - arm-linux-gnueabi(hf):
970
+   - Arm 32 bit Linux with GNU libc for EABI (hard-float)
971
+     EABI: Embedded Application Binary Interface
972
+ - arm-none-eabi:
973
+   - Arm 32 bit *bare-metal* with newlib
974
+
975
+ # arm-none-eabi
976
+
977
+ - RP2040 (Raspberry Pi Pico)
978
+ - 133 MHz Arm Cortex-M0+ (dual)
979
+
980
+ ![](images/rpi_pico.jpg){:align="right" height="800"}
981
+
982
+ # Configurable with arm-none-eabi? *No*
983
+
984
+ ```bash
985
+ $ cd ruby
986
+ $ ./autogen.sh #=> configure
987
+ $ ./configure --host=arm-none-eabi
988
+ #=> error😒😒😒😒😒😒
989
+ ```
990
+
991
+ {:.center}
992
+ 　
993
+ {::tag name="x-large"}*Why?*{:/tag}
994
+ Because "none-eabi" doesn't support
995
+ things like *`stdio.h`* that CRuby requires
996
+
997
+ # The next step is to make an effective build
998
+
999
+ - Configurable against bare-metal
1000
+ - Arm, ESP32, and PIC32, etc.
1001
+ - Compile only what we need instead of making libruby-static.a
1002
+ - `parse.o` + `node.o` + `parser_st.o` + *`?.o`*
1003
+ - Resolving the lack of dependencies is another challenge
1004
+
1005
+ # Still some deVALUE to do
1006
+
1007
+ ![](images/dependencies-3.png){:width="1700"}
1008
+
1009
+ # Functions to be deVALUEd
1010
+
1011
+ ```c
1012
+ VALUE rb_node_file_path_val(const NODE *node);
1013
+ VALUE rb_node_str_string_val(const NODE *node);
1014
+ VALUE rb_node_line_lineno_val(const NODE *node);
1015
+ VALUE rb_node_encoding_val(const NODE *node);
1016
+ VALUE rb_node_integer_literal_val(const NODE *n);
1017
+ VALUE rb_node_float_literal_val(const NODE *n);
1018
+ VALUE rb_node_rational_literal_val(const NODE *n);
1019
+ VALUE rb_node_imaginary_literal_val(const NODE *n);
1020
+ VALUE rb_node_regx_string_val(const NODE *node);
1021
+ VALUE rb_node_sym_string_val(const NODE *node);
1022
+ VALUE rb_str_new_parser_string(rb_parser_string_t *str);
1023
+ VALUE rb_str_new_mutable_parser_string(rb_parser_string_t *str);
1024
+ VALUE rb_sym2id(VALUE sym);
1025
+ ```
1026
+
1027
+ {:.center}
1028
+ Further deVALUE is needed
1029
+
1030
+ # Build with dummy functions (PoC)
1031
+
1032
+ ```bash
1033
+ # Linking libruby-static.a
1034
+ -rwxr-xr-x 2512528 mrbc_prism
1035
+ -rwxr-xr-x 25080800 mrbc_lrama 😡
1036
+ 👇
1037
+ # Linking dummy functions
1038
+ -rwxr-xr-x 2512528 mrbc_prism
1039
+ -rwxr-xr-x 1927440 mrbc_lrama 🥳
1040
+ ```
1041
+
1042
+ {:.center}
1043
+ Now mrbc_lrama is barely able to compile
1044
+ *`puts 'Hello, World!'`*
1045
+
1046
+ # hide-title
1047
+
1048
+ {:.center}
1049
+ {::tag name="large"}*Work in progress...*{:/tag}
1050
+ 　
1051
+ 　
1052
+ {::tag name="small"}`https://github.com/picoruby/mruby-compiler2`{:/tag}
1053
+ 　
1054
+ {::tag name="small"}`https://github.com/picoruby/mruby-bin-mrbc2`{:/tag}
1055
+
1056
+ ## prop
1057
+
1058
+ hide-title
1059
+ : true
1060
+
1061
+ # FYI, PicoRuby talk at #rubykaigiC 16:40
1062
+
1063
+ ![](images/s01.png){:width="1500" relative_margin_top="5"}
1064
+
1065
+ # Also, PicoRuby appears in Lightning Talks
1066
+
1067
+ ![](images/hachi.png){:width="1500" relative_margin_top="5"}
1068
+
1069
+ # This guy may give you a Raspi Pico
1070
+
1071
+ ![](images/yancya.png){:height="700" relative_margin_top="0"}
1072
+
1073
+ {:.center}
1074
+ {::tag name="xx-small"}https://twitter.com/hanachin_/status/1790713159906681259{:/tag}
1075
+ *Catch him at the venue*
1076
+
1077
+ # Lrama-gen univ-parser is almost there!!!
1078
+
1079
+ - Embed libruby-static.a in PicoRuby compiler
1080
+ - → RAM consumption: *10 MB* 😨
1081
+ - → ROM size: *24 MB* 😡
1082
+ - → DeVALUE, DeIMEMO, DeGC
1083
+ - → RAM consumption: *40 KB* 🥳
1084
+ - → Improve build process
1085
+ - → ROM size: *1.9 MB* 😊
1086
+
1087
+ # Wrap-up
1088
+
1089
+ - PicoRuby compiler will integrate
1090
+ both Prism and Lrama-gen parser
1091
+ - Remaining work
1092
+   - Cross compilation
1093
+   - DeVALUE (a little more)
1094
+   - Symbol table (ID)
1095
+   - Fine-tuning
1096
+
1097
+ ![](images/QR_github-com-picoruby-picoruby.png){:
1098
+ align="right" width="600" relative_margin_top="3" relative_margin_left="0"
1099
+ draw0="[text, github.com/picoruby/picoruby, -0.14, 1.01, {color: black, size: 34, font_family: 'Courier Prime'\}]"
1100
+ }
1101
+
1102
+
1103
+