rip-lang 2.9.2 → 3.0.0

package/docs/INTERNALS.md DELETED
@@ -1,857 +0,0 @@
1
- <p><img src="rip.svg" alt="Rip Logo" width="100"></p>
2
-
3
- # Rip Compiler Internals
4
-
5
- > Technical reference for the Rip compiler architecture, code generation, and parsing.
6
-
7
- **Version:** 2.5.1
8
- **Test Coverage:** 1046/1046 rip tests (100%) ✅
9
- **Status:** Stable & Production Ready - Self-Hosting Complete
10
-
11
- ---
12
-
13
- ## Table of Contents
14
-
15
- 1. [Architecture Overview](#1-architecture-overview)
16
- 2. [S-Expressions](#2-s-expressions)
17
- 3. [Code Generation](#3-code-generation)
18
- 4. [Comprehensions](#4-comprehensions)
19
- 5. [String Token Processing](#5-string-token-processing)
20
- 6. [Solar Parser Generator](#6-solar-parser-generator)
21
-
22
- ---
23
-
24
- # 1. Architecture Overview
25
-
26
- ## The Pipeline
27
-
28
- ```
29
- Source Code → CoffeeScript Lexer → Solar Parser → S-Expressions → Codegen → JavaScript
30
-               (3,537 LOC)          (363 LOC)      (arrays!)       (7,965 LOC)
31
-               15 years tested      Generated!     Clean IR!       Complete!
32
- ```
33
-
34
- ## Key Files
35
-
36
- | File | Purpose | Size | Modify? |
37
- |------|---------|------|---------|
38
- | `src/grammar/grammar.rip` | Grammar spec | 872 LOC | ✅ Yes |
39
- | `src/grammar/solar.rip` | Parser generator | 1,001 LOC | ❌ No |
40
- | `src/parser.js` | Generated parser | 363 LOC | ❌ No (auto-gen) |
41
- | `src/lexer.js` | Lexer + Rewriter | 3,537 LOC | ⚠️ Rewriter only |
42
- | `src/compiler.js` | Code generator | 7,965 LOC | ✅ Yes |
43
-
44
- ## Example Flow
45
-
46
- ```coffee
47
- # Input
48
- x = 42
49
-
50
- # Tokens (from lexer)
51
- [["IDENTIFIER", "x"], ["=", "="], ["NUMBER", "42"]]
52
-
53
- # S-Expression (from parser)
54
- ["program", ["=", "x", 42]]
55
-
56
- # Generated Code (from codegen)
57
- "x = 42;"
58
- ```
59
-
60
- ---
61
-
62
- # 2. S-Expressions
63
-
64
- ## What are S-Expressions?
65
-
66
- S-expressions are simple arrays that serve as Rip's intermediate representation (IR):
67
-
68
- ```javascript
69
- // Traditional AST (CoffeeScript, TypeScript, Babel)
70
- class BinaryOp {
71
- constructor(op, left, right) { ... }
72
- compile() { /* 50+ lines */ }
73
- }
74
-
75
- // Rip's S-Expression
76
- ["+", left, right] // That's it!
77
- ```
78
-
79
- **Result:** CoffeeScript's compiler is 17,760 LOC. Rip's is ~11,000 LOC—**smaller, yet includes a full reactive runtime.**
80
-
81
- ## S-Expression Structure
82
-
83
- - **Head:** String identifying node type (`"if"`, `"def"`, `"+"`, etc.)
84
- - **Rest:** Arguments/children for that node
85
-
86
- **Examples:**
87
- ```javascript
88
- // Assignment
89
- ['=', 'x', 42]
90
-
91
- // Function call
92
- ['add', 5, 10]
93
-
94
- // Binary operator
95
- ['+', 'a', 'b']
96
-
97
- // Nested
98
- ['=', 'result', ['+', ['*', 2, 3], 4]]
99
- ```
100
-
101
- ## Why S-Expressions?
102
-
103
- **Transform the IR (s-expressions), not the output (strings)**
104
-
105
- This single principle makes the code:
106
- - **50% smaller** (despite more explicit logic)
107
- - **Type-safe** (no silent failures)
108
- - **Fast** (no parsing overhead)
109
- - **Clear** (readable data structures)
110
- - **Maintainable** (obvious intent)
111
-
112
- ### Debugging Comparison
113
-
114
- **String Manipulation:**
115
- ```javascript
116
- console.log(iifeCode);
117
- // "(() => {\n const result = [];\n for..."
118
- // What you see: A blob of text
119
- // Time to debug: 20-30 minutes
120
- ```
121
-
122
- **S-Expression:**
123
- ```javascript
124
- console.log(sexpr);
125
- // ["comprehension", ["*", "x", 2], [["for-in", ["x"], ["array", 1, 2, 3], null]], []]
126
- // What you see: Clear structure
127
- // Time to debug: 2-3 minutes
128
- ```
129
-
130
- ## Complete Node Type Reference
131
-
132
- ### Top Level
133
- ```javascript
134
- ['program', ...statements]
135
- ```
136
-
137
- ### Variables & Assignment
138
- ```javascript
139
- ['=', target, value]
140
- ['+=', target, value] // And all compound assigns: -=, *=, /=, %=, **=
141
- ['&&=', target, value]
142
- ['||=', target, value]
143
- ['?=', target, value] // Maps to ??=
144
- ['??=', target, value]
145
- ```
146
-
147
- ### Functions
148
- ```javascript
149
- // Named function
150
- ['def', name, params, body]
151
-
152
- // Thin arrow (unbound this)
153
- ['->', params, body]
154
-
155
- // Fat arrow (bound this)
156
- ['=>', params, body]
157
-
158
- // Parameters can be:
159
- 'name' // Simple param
160
- ['rest', 'name'] // Rest: ...name
161
- ['default', 'name', expr] // Default: name = expr
162
- ['expansion'] // Expansion marker: (a, ..., b)
163
- ['object', ...] // Object destructuring
164
- ['array', ...] // Array destructuring
165
- ```
166
-
167
- ### Calls & Property Access
168
- ```javascript
169
- [callee, ...args] // Function call
170
- ['await', expr] // Await
171
- ['.', obj, 'prop'] // Property: obj.prop
172
- ['?.', obj, 'prop'] // Optional: obj?.prop
173
- ['::', obj, 'prop'] // Prototype: obj.prototype.prop
174
- ['?::', obj, 'prop'] // Soak prototype
175
- ['[]', arr, index] // Index: arr[index]
176
- ['?[]', arr, index] // Soak index
177
- ['optindex', arr, index] // ES2020 optional: arr?.[index]
177
- ['optcall', fn, ...args] // ES2020 optional: fn?.(args)
179
- ['?call', fn, ...args] // Soak call: fn?(args)
180
- ['new', constructorExpr] // Constructor
181
- ['super', ...args] // Super call
182
- ['tagged-template', tag, str] // Tagged template
183
- ```
184
-
185
- ### Data Structures
186
- ```javascript
187
- ['array', ...elements] // Array literal
188
- ['object', ...pairs] // Object literal (pairs: [key, value])
189
- ['...', expr] // Spread (unary)
190
- ```
191
-
192
- ### Operators
193
- ```javascript
194
- // Arithmetic
195
- ['+', left, right]
196
- ['-', left, right]
197
- ['*', left, right]
198
- ['/', left, right]
199
- ['%', left, right]
200
- ['**', left, right]
201
-
202
- // Comparison (== compiles to ===)
203
- ['==', left, right]
204
- ['!=', left, right]
205
- ['<', left, right]
206
- ['<=', left, right]
207
- ['>', left, right]
208
- ['>=', left, right]
209
-
210
- // Logical
211
- ['&&', left, right]
212
- ['||', left, right]
213
- ['??', left, right]
214
-
215
- // Unary
216
- ['!', expr]
217
- ['~', expr]
218
- ['-', expr] // Unary minus
219
- ['+', expr] // Unary plus
220
- ['++', expr, isPostfix]
221
- ['--', expr, isPostfix]
222
- ['typeof', expr]
223
- ['delete', expr]
224
-
225
- // Special
226
- ['instanceof', expr, type]
227
- ['?', expr] // Existence check
228
- ```
229
-
230
- ### Control Flow
231
- ```javascript
232
- ['if', condition, thenBlock, elseBlock?]
233
- ['unless', condition, body]
234
- ['?:', condition, thenExpr, elseExpr] // Ternary
235
- ['switch', discriminant, cases, defaultCase?]
236
- ```
237
-
238
- ### Loops
239
- ```javascript
240
- ['for-in', vars, iterable, step?, guard?, body]
241
- ['for-of', vars, object, guard?, body]
242
- ['while', condition, body]
243
- ['until', condition, body]
244
- ['loop', body]
245
- ['break']
246
- ['continue']
247
- ['break-if', condition]
248
- ['continue-if', condition]
249
- ```
250
-
251
- ### Comprehensions
252
- ```javascript
253
- ['comprehension', expr, iterators, guards]
254
- ['object-comprehension', keyExpr, valueExpr, iterators, guards]
255
- ```
256
-
257
- ### Exceptions
258
- ```javascript
259
- ['try', tryBlock, [catchParam, catchBlock]?, finallyBlock?]
260
- ['throw', expr]
261
- ```
262
-
263
- ### Classes
264
- ```javascript
265
- ['class', name, parent?, ...members]
266
- ```
267
-
268
- ### Ranges & Slicing
269
- ```javascript
270
- ['..', from, to] // Inclusive range
271
- ['...', from, to] // Exclusive range
272
- ```
273
-
274
- ### Blocks
275
- ```javascript
276
- ['block', ...statements] // Multiple statements
277
- ['do-iife', expr] // Do expression (IIFE)
278
- ```
279
-
280
- ### Modules
281
- ```javascript
282
- ['import', specifiers, source]
283
- ['export', statement]
284
- ['export-default', expr]
285
- ['export-all', source]
286
- ['export-from', specifiers, source]
287
- ```
288
-
289
- ---
290
-
291
- # 3. Code Generation
292
-
293
- ## Overview
294
-
295
- The code generator (`src/compiler.js`) is the last stage of the pipeline: it turns the s-expressions produced by the parser into JavaScript. Its `CodeGenerator` class is a pattern matcher, essentially a set of switch cases keyed on the head of each array.
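The pattern-matching idea can be illustrated with a minimal sketch. The `generate` function below is hypothetical and covers only leaves, `program`, assignment, binary arithmetic, and plain calls; it is not the real `CodeGenerator`, and the node shapes are taken from the reference in section 2.

```javascript
// Hypothetical, heavily reduced generator: not Rip's actual CodeGenerator.
const BINARY = new Set(['+', '-', '*', '/', '%', '**']);

function generate(sexpr) {
  // Leaves: numbers and identifier strings are emitted as-is.
  if (!Array.isArray(sexpr)) return String(sexpr);

  const [head, ...rest] = sexpr;
  if (head === 'program') {
    return rest.map((stmt) => generate(stmt) + ';').join('\n');
  }
  if (head === '=') {
    return `${generate(rest[0])} = ${generate(rest[1])}`;
  }
  if (BINARY.has(head) && rest.length === 2) {
    return `(${generate(rest[0])} ${head} ${generate(rest[1])})`;
  }
  // Anything else is treated as a call: [callee, ...args]
  return `${generate(head)}(${rest.map((arg) => generate(arg)).join(', ')})`;
}

const js = generate(['program', ['=', 'result', ['+', ['*', 2, 3], 4]]]);
console.log(js); // result = ((2 * 3) + 4);
```

Because each case only inspects the head and recurses on the rest, supporting a new node type in this sketch means adding one more branch.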
296
-
297
- ## Understanding Grammar vs Node Types
298
-
299
- **The Numbers:**
300
- - **91 Grammar Types** - BNF non-terminals (like `Expression`, `Statement`, `Value`)
301
- - **406 Grammar Rules** - Production rules (the `o '...'` lines in grammar.rip)
302
- - **110+ S-Expression Node Types** - Actual output strings (like `"yield"`, `"+"`, `"block"`)
303
-
304
- **Key Insight:** Grammar types are NOT the same as node types!
305
-
306
- - Grammar types are **organizational** (structure the BNF grammar)
307
- - Node types are **concrete output** (what codegen handles)
308
-
309
- ## Key Implementation Features
310
-
311
- ### 1. Context-Aware Generation
312
-
313
- Some patterns generate different code based on usage context:
314
-
315
- ```javascript
316
- generate(sexpr, context = 'statement') {
317
- // context can be 'statement' or 'value'
318
- }
319
- ```
320
-
321
- **Example: Comprehensions**
322
-
323
- ```coffee
324
- # Statement context (result discarded) → Plain loop
325
- console.log x for x in arr
326
- # → for (const x of arr) { console.log(x); }
327
-
328
- # Value context (result used) → IIFE with array building
329
- result = (x * 2 for x in arr)
330
- # → (() => { const result = []; for...; return result; })()
331
- ```
332
-
333
- **Pass 'value' context in:**
334
- - Assignments (right side)
335
- - Return statements
336
- - Function arguments
337
- - Array/object elements
338
- - Ternary branches
339
- - Last statement in function (implicit return)
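As a rough illustration of how one comprehension node can yield two outputs, here is a sketch under simplifying assumptions: a single `for-in` iterator, one loop variable, and no guards. `genExpr` and `genComprehension` are hypothetical names, not functions from `src/compiler.js`.

```javascript
// Hypothetical sketch: emits a plain loop or an array-building IIFE for
// ['comprehension', expr, iterators, guards] depending on the context.
function genExpr(e) {
  if (!Array.isArray(e)) return String(e);
  const [head, ...args] = e;
  if (head === 'array') return `[${args.map((a) => genExpr(a)).join(', ')}]`;
  if (['+', '-', '*', '/'].includes(head) && args.length === 2) {
    return `${genExpr(args[0])} ${head} ${genExpr(args[1])}`;
  }
  return `${genExpr(head)}(${args.map((a) => genExpr(a)).join(', ')})`;
}

function genComprehension(sexpr, context = 'statement') {
  const [, expr, iterators] = sexpr;            // guards are ignored in this sketch
  const [, [loopVar], iterable] = iterators[0]; // ['for-in', vars, iterable, step]
  const header = `for (const ${loopVar} of ${genExpr(iterable)})`;
  if (context === 'statement') {
    return `${header} { ${genExpr(expr)}; }`;   // result discarded: plain loop
  }
  return `(() => { const result = []; ${header} { result.push(${genExpr(expr)}); } return result; })()`;
}

const node = ['comprehension', ['*', 'x', 2], [['for-in', ['x'], ['array', 1, 2, 3], null]], []];
console.log(genComprehension(node, 'statement'));
// for (const x of [1, 2, 3]) { x * 2; }
console.log(genComprehension(node, 'value'));
```

The s-expression is identical in both calls; only the `context` argument supplied by the parent changes the emitted code.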
340
-
341
- ### 2. Variable Scoping System
342
-
343
- **CoffeeScript semantics:** Function-level scoping with closure access
344
-
345
- ```javascript
346
- // Program level
347
- let a, b, fn;
348
-
349
- // Function level - only NEW variables
350
- fn = function() {
351
- let x, y; // New variables
352
- a = 1; // Uses outer 'a' (no redeclaration)
353
- x = 2; // Uses local 'x'
354
- };
355
- ```
356
-
357
- **Implementation:**
358
- - `collectProgramVariables()` - Walks top-level, stops at functions
359
- - `collectFunctionVariables()` - Walks function body, stops at nested functions
360
- - Filters out outer variables (accessed via closure)
361
- - Emits `let` declarations at scope top
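The collection steps listed above might look roughly like the following. This is a sketch, not the real `collectFunctionVariables()`: the helper names are hypothetical, and only simple `['=', name, value]` assignments are recognized.

```javascript
// Hypothetical sketch of function-level variable collection.
const FUNCTION_HEADS = new Set(['def', '->', '=>']);

function collectAssigned(sexpr, found = new Set()) {
  if (!Array.isArray(sexpr)) return found;
  const [head, ...rest] = sexpr;
  if (FUNCTION_HEADS.has(head)) return found; // stop at nested functions
  if (head === '=' && typeof rest[0] === 'string') found.add(rest[0]);
  for (const child of rest) collectAssigned(child, found);
  return found;
}

function localDeclarations(body, outerVars) {
  // Names already declared in an enclosing scope are reached via closure,
  // so they must not be redeclared here.
  return [...collectAssigned(body)].filter((name) => !outerVars.has(name));
}

const body = [
  'block',
  ['=', 'a', 1],
  ['=', 'x', 2],
  ['=', 'y', ['->', [], ['block', ['=', 'z', 3]]]],
];
const locals = localDeclarations(body, new Set(['a', 'b', 'fn']));
console.log(`let ${locals.join(', ')};`); // let x, y;
```

Note that `z` is not hoisted into the outer function: the walk stops at the nested `->`, which would collect its own variables separately.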
362
-
363
- ### 3. Auto-Detection
364
-
365
- **Functions automatically become async or generators:**
366
-
367
- ```coffee
368
- # Contains await → becomes async function
369
- getData = ->
370
- result = await fetch(url)
371
- result.json()
372
- # → async function() { ... }
373
-
374
- # Contains yield → becomes generator
375
- counter = ->
376
- yield 1
377
- yield 2
378
- # → function*() { ... }
379
-
380
- # Contains dammit operator → becomes async
381
- fetchAll = ->
382
- users = getUsers!
383
- posts = getPosts!
384
- # → async function() { ... await getUsers() ... }
385
- ```
386
-
387
- **Detection methods:**
388
- - `containsAwait(sexpr)` - Checks for `await` nodes + dammit operators
389
- - `containsYield(sexpr)` - Checks for `yield` nodes
390
- - Stops at function boundaries (nested functions checked separately)
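A boundary-aware search of this kind can be sketched as below. This version is an assumption-laden simplification: it only looks for `await` heads, because the document does not show how the dammit operator is represented in the s-expression.

```javascript
// Hypothetical sketch; the real containsAwait() also recognizes the dammit operator.
const FN_HEADS = new Set(['def', '->', '=>']);

function containsAwait(sexpr) {
  if (!Array.isArray(sexpr)) return false;
  const [head] = sexpr;
  if (head === 'await') return true;
  if (FN_HEADS.has(head)) return false; // nested functions are checked separately
  return sexpr.some((child) => containsAwait(child));
}

const fnBody = ['block', ['=', 'result', ['await', ['fetch', 'url']]], [['.', 'result', 'json']]];
const withNested = ['block', ['=', 'cb', ['->', [], ['block', ['await', ['tick']]]]]];
console.log(containsAwait(fnBody));     // true
console.log(containsAwait(withNested)); // false
```

The second result is `false` because the only `await` sits inside a nested function, which decides its own `async` status.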
391
-
392
- ### 4. Dual Optional Syntax
393
-
394
- **Rip supports BOTH CoffeeScript soak AND ES2020 optional chaining:**
395
-
396
- | Syntax | Type | Compiles To |
397
- |--------|------|-------------|
398
- | `arr?` | Soak | `(arr != null)` |
399
- | `arr?[0]` | Soak | `(arr != null ? arr[0] : undefined)` |
400
- | `fn?(x)` | Soak | `(typeof fn === 'function' ? fn(x) : undefined)` |
401
- | `obj?.prop` | ES2020 | `obj?.prop` |
402
- | `arr?.[0]` | ES2020 | `arr?.[0]` |
403
- | `fn?.(x)` | ES2020 | `fn?.(x)` |
404
- | `x ?? y` | ES2020 | `x ?? y` |
405
- | `a ??= 10` | ES2021 | `a ??= 10` |
406
-
407
- **Mix and match:**
408
- ```coffee
409
- obj?.arr?[0] # Optional chaining + CoffeeScript soak together!
410
- ```
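The soak rows of the table can be reproduced with a small sketch. `genSoak` is a hypothetical helper that assumes its operands are already-generated JavaScript strings; the node heads `?`, `?[]`, and `?call` come from the node type reference.

```javascript
// Hypothetical sketch covering three soak node types from the node reference.
function genSoak([head, target, ...args]) {
  switch (head) {
    case '?':
      return `(${target} != null)`;
    case '?[]':
      return `(${target} != null ? ${target}[${args[0]}] : undefined)`;
    case '?call':
      return `(typeof ${target} === 'function' ? ${target}(${args.join(', ')}) : undefined)`;
    default:
      throw new Error(`not a soak node: ${head}`);
  }
}

console.log(genSoak(['?', 'arr']));         // (arr != null)
console.log(genSoak(['?[]', 'arr', 0]));    // (arr != null ? arr[0] : undefined)
console.log(genSoak(['?call', 'fn', 'x'])); // (typeof fn === 'function' ? fn(x) : undefined)
```

The target text appears twice in each expansion, so this sketch is only safe for simple identifiers; a target with side effects would need to be evaluated once and cached.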
411
-
412
- ### 5. Range Optimization
413
-
414
- **Traditional for loops instead of wasteful IIFEs:**
415
-
416
- ```coffee
417
- # Optimized to traditional loop
418
- for i in [1...100]
419
- process(i)
420
- # → for (let i = 1; i < 100; i++) { process(i); }
421
-
422
- # Not: (() => { return Array.from(...) })(1, 100).forEach(...)
423
- # Savings: 73% smaller code!
424
- ```
425
-
426
- **Reverse iteration support:**
427
- ```coffee
428
- for i in [10..1] by -1
429
- process(i)
430
- # → for (let i = 10; i >= 1; i--) { process(i); }
431
- ```
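One way to picture the range optimization is the sketch below. `genRangeLoop` is a hypothetical helper that assumes the step is a numeric literal known at compile time; the range node shapes follow the reference (`['..', from, to]` inclusive, `['...', from, to]` exclusive).

```javascript
// Hypothetical sketch: turns a for-in over a literal range into a counting loop.
function genRangeLoop(loopVar, [kind, from, to], step, bodyJs) {
  const inclusive = kind === '..';
  const descending = step < 0;
  const cmp = descending ? (inclusive ? '>=' : '>') : (inclusive ? '<=' : '<');
  const update =
    step === 1 ? `${loopVar}++` :
    step === -1 ? `${loopVar}--` :
    `${loopVar} += ${step}`;
  return `for (let ${loopVar} = ${from}; ${loopVar} ${cmp} ${to}; ${update}) { ${bodyJs} }`;
}

console.log(genRangeLoop('i', ['...', 1, 100], 1, 'process(i);'));
// for (let i = 1; i < 100; i++) { process(i); }
console.log(genRangeLoop('i', ['..', 10, 1], -1, 'process(i);'));
// for (let i = 10; i >= 1; i--) { process(i); }
```

The comparison operator is the only part that depends on both the range kind and the direction, which is why the two are resolved together.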
432
-
433
- ## Debug Tools
434
-
435
- ```bash
436
- # See tokens from lexer
437
- echo 'x = 42' | ./bin/rip -t
438
-
439
- # See s-expressions from parser
440
- echo 'x = 42' | ./bin/rip -s
441
-
442
- # See generated JavaScript
443
- echo 'x = 42' | ./bin/rip -c
444
-
445
- # Interactive REPL with debug modes
446
- ./bin/rip
447
- rip> .tokens # Toggle token display
448
- rip> .sexp # Toggle s-expression display
449
- rip> .js # Toggle JS display
450
- ```
451
-
452
- ---
453
-
454
- # 4. Comprehensions
455
-
456
- ## Purpose
457
-
458
- Comprehensions can act as **data builders** or **control loops**. Rip distinguishes these automatically based on surrounding context, improving on CoffeeScript's behavior with smarter optimizations.
459
-
460
- **Key insight:** At parse time, all comprehensions produce identical s-expressions. Context resolution happens only during code generation.
461
-
462
- ## Rip vs CoffeeScript Design Philosophy
463
-
464
- **CoffeeScript (syntax-based):**
465
- - `for x in xs then f(x)` → plain loop (statement-style)
466
- - `for x in xs` (multi-line) → IIFE (expression-style, always returns array)
467
- - **Problem:** Multi-line form builds wasteful arrays even when result unused
468
-
469
- **Rip (context-based):**
470
- - Same syntax, different output based on **how value is used**
471
- - `for x in xs` → IIFE if assigned/returned, plain loop if result discarded
472
- - **Benefit:** Automatic optimization - no wasteful array building!
473
-
474
- **Example of improvement:**
475
- ```coffee
476
- fn = ->
477
- process x for x in arr # ← Rip: plain loop! CS: IIFE (wasteful)
478
- doMore()
479
- ```
480
-
481
- ## The Core Rule
482
-
483
- Comprehensions have **dual semantics** based on how their value is used:
484
-
485
- | Context | Generates | Example |
486
- |---------|-----------|---------|
487
- | **Value Context** | IIFE (builds array) | `x = (n*2 for n in arr)` |
488
- | **Statement Context** | Plain loop (side effects) | `process(n) for n in arr; other()` |
489
-
490
- **Critical principle:** Context is **downward-propagating**. Parent nodes decide if children need values.
491
-
492
- ## Context Propagation Patterns
493
-
494
- | Parent Node | Child | Context | Reason |
495
- |-------------|-------|---------|---------|
496
- | `Assignment` | RHS | `'value'` | Value assigned to variable |
497
- | `Call` | Arguments | `'value'` | Values passed to function |
498
- | `Return` | Expression | `'value'` | Value returned from function |
499
- | `Function` | Last statement | `'value'` | Implicit return |
500
- | `Function` | Non-last statements | `'statement'` | Result discarded |
501
- | `Loop` | Body (all statements) | `'statement'` | Loops don't return values |
502
- | `If/Unless` | Branches | **Inherit parent** | Pass through context |
503
- | `Array` | Elements | `'value'` | Values stored in array |
504
- | `Object` | Values | `'value'` | Values stored in object |
505
- | `Ternary` | Both branches | `'value'` | Both branches produce values |
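The table can be read as a function from a parent node to the contexts of its children. The sketch below encodes a few rows; `childContexts` and its parent type names are hypothetical and chosen for illustration only.

```javascript
// Hypothetical sketch of downward context propagation for a few parent types.
function childContexts(parentType, childCount, parentContext) {
  switch (parentType) {
    case 'function': // last statement is the implicit return
      return Array.from({ length: childCount }, (_, i) =>
        i === childCount - 1 ? 'value' : 'statement');
    case 'loop':     // loop bodies never produce a value
      return Array(childCount).fill('statement');
    case 'if':       // branches inherit whatever the parent was asked for
      return Array(childCount).fill(parentContext);
    case 'array':
    case 'call':
    case 'ternary':
      return Array(childCount).fill('value');
    default:
      throw new Error(`unknown parent type: ${parentType}`);
  }
}

console.log(childContexts('function', 3, 'statement')); // [ 'statement', 'statement', 'value' ]
console.log(childContexts('if', 2, 'value'));           // [ 'value', 'value' ]
```

Only the `if` case reads `parentContext`; every other parent decides on its own, which is what makes the propagation strictly top-down.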
506
-
507
- ## Value Context (Builds Array)
508
-
509
- When parent demands a value, comprehension generates an IIFE:
510
-
511
- ```javascript
512
- (() => {
513
- const result = [];
514
- for (const x of arr) { result.push(x * 2); }
515
- return result;
516
- })()
517
- ```
518
-
519
- **IIFE form guarantees:**
520
- - ✅ No variable leakage (lexically scoped)
521
- - ✅ Always returns array (even `[]` for empty input)
522
- - ✅ Single evaluation of iterator source
523
-
524
- ## Statement Context (Plain Loop)
525
-
526
- When value is discarded, generate an efficient plain loop:
527
-
528
- ```javascript
529
- for (const x of arr) {
530
- process(x); // Side effects only
531
- }
532
- ```
533
-
534
- **Plain loop form guarantees:**
535
- - ✅ No array allocation (efficient)
536
- - ✅ Supports `break`/`continue`
537
- - ✅ Loop variables scoped with `const`/`let`
538
-
539
- ## Edge Cases
540
-
541
- ### Own + Guard + Value Variable (Critical!)
542
-
543
- ```coffee
544
- for own k, v of obj when v > 5
545
- process(k, v)
546
- ```
547
-
548
- **Must generate:**
549
- ```javascript
550
- for (const k in obj) {
551
- if (obj.hasOwnProperty(k)) { // 1. Own check FIRST
552
- const v = obj[k]; // 2. Assign value SECOND
553
- if (v > 5) { // 3. Guard check THIRD (uses v!)
554
- process(k, v);
555
- }
556
- }
557
- }
558
- ```
559
-
560
- **Bug to avoid:** Never check guard before value is defined!
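The required ordering can be checked with a small sketch that emits the loop and then runs it. `genForOwn` is a hypothetical helper taking pre-generated JavaScript fragments; it is not the compiler's actual code path.

```javascript
// Hypothetical sketch: emits the own check, the value binding, then the guard, in that order.
function genForOwn(keyVar, valueVar, objJs, guardJs, bodyJs) {
  return [
    `for (const ${keyVar} in ${objJs}) {`,
    `  if (${objJs}.hasOwnProperty(${keyVar})) {`,
    `    const ${valueVar} = ${objJs}[${keyVar}];`,
    `    if (${guardJs}) {`,
    `      ${bodyJs}`,
    `    }`,
    `  }`,
    `}`,
  ].join('\n');
}

const code = genForOwn('k', 'v', 'obj', 'v > 5', 'seen.push(k);');

// Run the generated loop against an object with one inherited property.
const obj = Object.assign(Object.create({ inherited: 99 }), { a: 3, b: 7 });
const seen = [];
new Function('obj', 'seen', code)(obj, seen);
console.log(seen); // [ 'b' ]
```

If the guard were emitted before the `const v = ...` line, the generated code would throw a `ReferenceError` on `v`, which is the bug the ordering prevents.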
561
-
562
- ### Async Comprehensions
563
-
564
- **Sequential (await in body):**
565
- ```coffee
566
- # IIFE is async, awaits happen serially inside loop
567
- results = (await fetchData(url) for url in urls)
568
- ```
569
-
570
- **Parallel (recommended for I/O):**
571
- ```coffee
572
- # Build array of promises, then await all in parallel
573
- results = await Promise.all (fetchData(url) for url in urls)
574
- ```
575
-
576
- ---
577
-
578
- # 5. String Token Processing
579
-
580
- ## Overview
581
-
582
- In CoffeeScript's lexer and parser, the STRING token carries several metadata properties that are essential for correctly transforming source strings into JavaScript output.
583
-
584
- ## The STRING Token Structure
585
-
586
- A STRING token is not just a simple string value. It's an object with these properties:
587
- - The string content itself
588
- - `quote`: The quote delimiter used
589
- - `initialChunk`: Boolean flag for first chunk in interpolated string
590
- - `finalChunk`: Boolean flag for last chunk in interpolated string
591
- - `indent`: The common indentation to strip from heredocs
592
- - `double`: Whether backslashes should be doubled in output
593
- - `heregex`: Object with regex flags for extended regex literals
594
-
595
- ## Property Descriptions
596
-
597
- ### 1. `quote` - Quote Delimiter Type
598
-
599
- **Purpose:** Records which quote characters were used to delimit the string in source code.
600
-
601
- **Possible Values:**
602
- - `"` - double quote
603
- - `'` - single quote
604
- - `"""` - triple double quote (heredoc)
605
- - `'''` - triple single quote (heredoc)
606
- - `///` - heregex (extended regex literal)
607
-
608
- **How It's Used:**
609
- - Detects heredocs: `heredoc = @quote.length is 3`
610
- - Adjusts location data for source maps
611
-
612
- ### 2. `initialChunk` / `finalChunk` - Interpolated String Chunks
613
-
614
- **Purpose:** Marks whether this STRING token is the first/last chunk in an interpolated string.
615
-
616
- **How It's Used:**
617
- - If `initialChunk` is true, `LEADING_BLANK_LINE` regex strips the leading blank line in heredocs
618
- - If `finalChunk` is true, `TRAILING_BLANK_LINE` regex strips the trailing blank line
619
-
620
- ### 3. `indent` - Common Indentation to Strip
621
-
622
- **Purpose:** Records the common leading whitespace found across all lines of a heredoc that should be stripped during compilation.
623
-
624
- **Example:**
625
- ```coffeescript
626
- x = """
627
- Line 1
628
- Line 2
629
- Indented more
630
- """
631
- # indent would be "    " (4 spaces)
632
- # After processing:
633
- # "Line 1\nLine 2\n Indented more"
634
- ```
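The effect of `indent`, together with the blank-line stripping tied to `initialChunk` and `finalChunk`, can be sketched as follows. `stripHeredoc` is a hypothetical function, not the lexer's code, and the two extra spaces before "Indented more" in the sample input are an assumption made for this example.

```javascript
// Hypothetical sketch of heredoc indentation stripping; not the lexer's actual code.
function stripHeredoc(raw) {
  const lines = raw.split('\n');
  // Drop the blank line after the opening quotes and the one before the closing quotes.
  if (lines.length && lines[0].trim() === '') lines.shift();
  if (lines.length && lines[lines.length - 1].trim() === '') lines.pop();
  // Common indent = shortest leading whitespace among non-blank lines.
  const indents = lines
    .filter((line) => line.trim() !== '')
    .map((line) => line.match(/^[ \t]*/)[0].length);
  const indent = indents.length ? Math.min(...indents) : 0;
  return lines.map((line) => line.slice(indent)).join('\n');
}

const raw = '\n    Line 1\n    Line 2\n      Indented more\n    ';
console.log(JSON.stringify(stripHeredoc(raw)));
// "Line 1\nLine 2\n  Indented more"
```

Indentation beyond the common prefix is kept, so relative structure inside the heredoc survives.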
635
-
636
- ### 4. `double` - Backslash Doubling Flag
637
-
638
- **Purpose:** Indicates whether backslash characters in the string should be doubled when generating JavaScript output.
639
-
640
- ### 5. `heregex` - Extended Regex Metadata
641
-
642
- **Purpose:** Contains metadata about extended regular expression (heregex) literals, including flags.
643
-
644
- **Example:**
645
- ```coffeescript
646
- pattern = ///
647
- ^ \d+ # starts with digits
648
- \s* # optional whitespace
649
- [a-z]+ # followed by letters
650
- $ # end of string
651
- ///i # case-insensitive flag
652
-
653
- # heregex: { flags: 'i' }
654
- # The whitespace and comments are stripped
655
- ```
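Stripping whitespace and comments from a heregex body can be approximated with a single regular expression. The sketch below is deliberately simplified and hypothetical: it does not handle escaped whitespace or a `#` inside a character class.

```javascript
// Hypothetical, simplified heregex cleanup: removes whitespace and trailing `#` comments.
function compactHeregex(body, flags = '') {
  const source = body.replace(/\s+(?:#.*)?/g, '');
  return new RegExp(source, flags);
}

const body = String.raw`
  ^ \d+     # starts with digits
  \s*       # optional whitespace
  [a-z]+    # followed by letters
  $         # end of string
`;

const pattern = compactHeregex(body, 'i');
console.log(pattern.source);         // ^\d+\s*[a-z]+$
console.log(pattern.test('42 Abc')); // true
```

The flags are passed separately, mirroring how the `heregex` metadata keeps them apart from the pattern text.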
656
-
657
- ## REGEX Tokens
658
-
659
- While STRING tokens have properties directly on the String object, **REGEX tokens store metadata in `token.data`**:
660
-
661
- ```javascript
662
- // REGEX token structure:
663
- token = ['REGEX', String("/pattern/flags"), location]
664
- token.data = {
665
- delimiter: '///', // '/' for normal regex, '///' for heregex
666
- heregex: {flags: 'gi'} // Only for heregex
667
- }
668
- ```
669
-
670
- ---
671
-
672
- # 6. Solar Parser Generator
673
-
674
- **Solar** is a complete SLR(1) parser generator **included with Rip** - written in Rip, compiled by Rip, zero external dependencies!
675
-
676
- **Location:** `src/grammar/solar.rip` (~1,000 lines)
677
- **Dependencies:** ZERO - Self-hosting, standalone
678
- **Type:** SLR(1) parser generator (similar to Yacc/Bison/Jison)
679
-
680
- ## What is Solar?
681
-
682
- Solar is an SLR(1) parser generator that generates parsers from grammar specifications. Rip uses Solar's **s-expression mode** to generate parsers that output simple array-based s-expressions instead of traditional AST nodes.
683
-
684
- **Key Innovation:** S-expressions as intermediate representation reduces compiler complexity by ~35%.
685
-
686
- **Unique Advantage:** Unlike most languages that depend on external parser generators (Yacc, Bison, Jison), **Rip includes its own parser generator** written in Rip itself! This makes Rip completely self-hosting with zero dependencies.
687
-
688
- ## Enabling S-Expression Mode
689
-
690
- In `src/grammar/grammar.rip`:
691
-
692
- ```coffeescript
693
- mode = 'sexp' # Enable s-expression output mode
694
- ```
695
-
696
- This tells Solar to generate a parser that builds s-expressions (nested arrays) instead of AST objects.
697
-
698
- ## Grammar Syntax
699
-
700
- ### Helper Function
701
-
702
- ```coffeescript
703
- o = (pattern, action, options) ->
704
- pattern = pattern.trim().replace /\s{2,}/g, ' '
705
- [pattern, action ? 1, options]
706
- ```
707
-
708
- ### Action Syntax - Three Styles
709
-
710
- #### Style 1: Default (Pass-Through)
711
-
712
- **When:** Omit action parameter (defaults to `1`)
713
- **Behavior:** Returns the value of the first symbol in the pattern
714
-
715
- ```coffeescript
716
- Expression: [
717
- o 'Value' # Returns Value (position 1)
718
- o 'Operation' # Returns Operation (position 1)
719
- ]
720
- ```
721
-
722
- #### Style 2: Simple S-Expression (Bare Numbers)
723
-
724
- **When:** Action string contains **no `$` references**
725
- **Behavior:** All bare numbers become positional `$$[...]` token references (position 1 of 3 becomes `$$[$0-2]`)
726
-
727
- ```coffeescript
728
- For: [
729
- o 'FOR ForVariables FOROF Expression Block', '["for-of", 2, 4, null, 5]'
730
- ]
731
- ```
732
-
733
- **Token positions:**
734
- - `FOR` (1), `ForVariables` (2), `FOROF` (3), `Expression` (4), `Block` (5)
735
-
736
- **Use for:** Most grammar rules - clean and simple!
737
-
738
- #### Style 3: Advanced (Dollar References)
739
-
740
- **When:** Action string contains `$n` patterns
741
- **Behavior:** Only `$n` replaced; bare numbers preserved as literals
742
-
743
- ```coffeescript
744
- Parenthetical: [
745
- o '( Body )', '$2.length === 1 ? $2[0] : $2'
746
- ]
747
- ```
748
-
749
- **Use for:** Conditional logic, array access, transformations
750
-
751
- ### Spread Operator in Actions
752
-
753
- Spread arrays into parent array:
754
-
755
- ```coffeescript
756
- Body: [
757
- o 'Line', '[1]' # Wrap: [Line]
758
- o 'Body TERMINATOR Line', '[...1, 3]' # Spread: [...Body, Line]
759
- ]
760
- ```
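The difference between Style 2 and Style 3 comes down to which numbers get rewritten. The sketch below is an assumption about how such a rewrite could work, generalized from the single `["=", $$[$0-2], $$[$0]]` example in this document; `rewriteAction` is a hypothetical name and not Solar's actual code.

```javascript
// Hypothetical sketch of the action rewriting rules; not Solar's actual code.
function rewriteAction(action, symbolCount) {
  // Position p (1-based) among symbolCount symbols maps to $$[$0-(symbolCount - p)].
  const ref = (p) => {
    const offset = symbolCount - Number(p);
    return offset === 0 ? '$$[$0]' : `$$[$0-${offset}]`;
  };
  if (/\$\d+/.test(action)) {
    // Style 3: only $n is replaced; bare numbers stay literal.
    return action.replace(/\$(\d+)/g, (_, p) => ref(p));
  }
  // Style 2: every bare number is a position reference.
  return action.replace(/\b\d+\b/g, (p) => ref(p));
}

console.log(rewriteAction('["=", 1, 3]', 3));
// ["=", $$[$0-2], $$[$0]]
console.log(rewriteAction('$2.length === 1 ? $2[0] : $2', 3));
// $$[$0-1].length === 1 ? $$[$0-1][0] : $$[$0-1]
console.log(rewriteAction('[...1, 3]', 3));
// [...$$[$0-2], $$[$0]]
```

In the second call the literals `1` and `0` survive untouched, which is the reason Style 3 exists.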
761
-
762
- ## Performance
763
-
764
- ### Parser Generation Speed
765
-
766
- **Solar generates Rip's parser in ~50ms!**
767
-
768
- **Real-world benchmark (Rip grammar):**
769
- - **Grammar size:** 91 types, 406 production rules
770
- - **Generated parser:** 250 states, SLR(1) parse table
771
- - **Solar:** ~50ms total
772
- - **Jison:** ~12,500ms (12.5 seconds)
773
- - **Speedup:** **250× faster!**
774
-
775
- ### Why Solar is So Fast
776
-
777
- 1. **Optimized Algorithms:**
778
- - Single-pass item grouping (no redundant scanning)
779
- - Efficient kernel signature computation
780
- - Direct state map lookups
781
- - Minimal object allocations
782
-
783
- 2. **Clean Implementation:**
784
- - No intermediate representations
785
- - Direct Map/Set usage (V8 optimized)
786
- - Simple data structures (arrays, not classes)
787
-
788
- ### Comparison with Jison
789
-
790
- | Metric | Jison | Solar | Winner |
791
- |--------|-------|-------|--------|
792
- | **Parser generation time** | 12,500ms | ~50ms | **Solar 250×** |
793
- | **Dependencies** | Many | Zero | **Solar** |
794
- | **Self-hosting** | No | Yes | **Solar** |
795
- | **Code size** | 2,285 LOC | ~1,000 LOC | **Solar 56% smaller** |
796
-
797
- ## Working with the Grammar
798
-
799
- ### Regenerate Parser
800
-
801
- After modifying the grammar:
802
-
803
- ```bash
804
- bun run parser
805
- ```
806
-
807
- This regenerates `src/parser.js` (363 LOC, auto-generated).
808
-
809
- ### Example Rule
810
-
811
- ```coffeescript
812
- Assignment: [
813
- o 'Assignable = Expression', '["=", 1, 3]'
814
- o 'Assignable = TERMINATOR Expression', '["=", 1, 4]'
815
- o 'Assignable = INDENT Expression OUTDENT', '["=", 1, 4]'
816
- ]
817
- ```
818
-
819
- **Breakdown:**
820
- - Pattern: `Assignable = Expression`
821
- - Tokens: Position 1 (Assignable), 2 (=), 3 (Expression)
822
- - Action: `'["=", 1, 3]'` becomes `["=", $$[$0-2], $$[$0]]`
823
- - Output: `["=", assignable, expression]`
824
-
825
- ### Debugging Grammar Rules
826
-
827
- ```bash
828
- # See what parser emits
829
- echo 'x = 42' | ./bin/rip -s
830
-
831
- # See tokens
832
- echo 'x = 42' | ./bin/rip -t
833
-
834
- # See generated JavaScript
835
- echo 'x = 42' | ./bin/rip -c
836
- ```
837
-
838
- ---
839
-
840
- ## Summary
841
-
842
- Solar's s-expression mode is the **secret sauce** that makes Rip practical:
843
-
844
- 1. **Simple IR:** Arrays instead of AST classes
845
- 2. **Grammar-driven:** Modify spec, regenerate parser
846
- 3. **Battle-tested:** Built on CoffeeScript's proven lexer
847
- 4. **Maintainable:** ~35% less code than CoffeeScript
848
- 5. **Extensible:** Add features by adding switch cases
849
-
850
- **Result:** A production-ready compiler with reactive framework in ~14,000 LOC—smaller than CoffeeScript's 17,760 LOC while delivering far more!
851
-
852
- ---
853
-
854
- **See Also:**
855
- - [GUIDE.md](GUIDE.md) - Language features and syntax
856
- - [RATIONALE.md](RATIONALE.md) - Design decisions and rationale
857
- - [BROWSER.md](BROWSER.md) - Browser usage and REPL guide