rip-lang 3.13.26 → 3.13.28

@@ -1,593 +0,0 @@
1
- <img src="https://raw.githubusercontent.com/shreeve/rip-lang/main/docs/assets/rip.png" style="width:50px" /> <br>
2
-
3
- # Rip Internals
4
-
5
- > Architecture, design decisions, and technical reference for the Rip compiler.
6
-
7
- ---
8
-
9
- ## Table of Contents
10
-
11
- 1. [Why Rip](#1-why-rip)
12
- 2. [Architecture](#2-architecture)
13
- 3. [S-Expressions](#3-s-expressions)
14
- 4. [Lexer & Rewriter](#4-lexer--rewriter)
15
- 5. [Code Generation](#5-code-generation)
16
- 6. [Compiler](#6-compiler)
17
- 7. [Solar Parser Generator](#7-solar-parser-generator)
18
- 8. [Debug Tools](#8-debug-tools)
19
- 9. [Future Work](#9-future-work)
20
-
21
- ---
22
-
23
- # 1. Why Rip
24
-
25
- ## The Short Version
26
-
27
- 1. **Simplicity scales** — S-expressions make compilers 50% smaller and 10x easier to maintain
28
- 2. **Zero dependencies** — True autonomy from the npm ecosystem
29
- 3. **Modern output** — ES2022 everywhere, no legacy baggage
30
- 4. **Reactivity as operators** — `:=`, `~=`, `~>` are language syntax, not library imports
31
- 5. **Self-hosting** — Rip compiles itself, including its own parser generator
32
-
33
- ## Why S-Expressions
34
-
35
- Most compilers use complex AST node classes. Rip uses **simple arrays**:
36
-
37
- ```javascript
38
- // Traditional AST (CoffeeScript, TypeScript, Babel)
39
- class BinaryOp {
40
- constructor(op, left, right) { ... }
41
- compile() { /* 50+ lines */ }
42
- }
43
-
44
- // Rip's S-Expression
45
- ["+", left, right] // That's it!
46
- ```
47
-
48
- **Result:** CoffeeScript's compiler is 17,760 LOC. Rip's is ~10,400 LOC — smaller, yet includes a complete reactive runtime, type system, component system, and source maps.
49
-
50
- > **Transform the IR (s-expressions), not the output (strings).**
51
-
52
- This single principle eliminates entire categories of bugs. When your IR is simple data (arrays), transformations are trivial and debuggable:
53
-
54
- ```javascript
55
- // Debugging: inspect the data directly
56
- console.log(sexpr);
57
- // ["comprehension", ["*", "x", 2], [["for-in", ["x"], ["array", 1, 2, 3]]], []]
58
-
59
- // vs. string manipulation
60
- console.log(code);
61
- // "(() => {\n const result = [];\n for (const x of arr) {\n..."
62
- ```
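To make the "transform the IR" principle concrete, here is a minimal sketch of a pass that operates on array-based s-expressions. The `foldConstants` helper is hypothetical and not part of Rip; it only assumes the `[head, ...rest]` shape described above.

```javascript
// Hypothetical helper (not part of Rip): fold constant arithmetic in an s-expression.
function foldConstants(sexpr) {
  if (!Array.isArray(sexpr)) return sexpr;      // leaf: identifier, number, string
  const [head, ...rest] = sexpr;
  const args = rest.map(foldConstants);         // transform children first
  const bothNumbers = args.length === 2 && args.every((a) => typeof a === 'number');
  if (bothNumbers && head === '+') return args[0] + args[1];
  if (bothNumbers && head === '*') return args[0] * args[1];
  return [head, ...args];                       // anything else is rebuilt unchanged
}

const before = ['=', 'x', ['+', 1, ['*', 2, 3]]];
const after = foldConstants(before);
console.log(JSON.stringify(after)); // ["=","x",7]
```

Because the input and output are plain arrays, the result of the pass can be inspected with `console.log` at any step, with no node classes involved.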
63
-
64
- ## Rip vs CoffeeScript
65
-
66
- | Feature | CoffeeScript | Rip |
67
- |---------|-------------|------|
68
- | Optional chaining | 4 soak operators | ES2020 `?.` / `?.[]` / `?.()` + shorthand `?[]` / `?()` |
69
- | Ternary | No | `x ? a : b` and `a if x else b` |
70
- | Regex features | Basic | Ruby-style (`=~`, indexing, captures in `_`) |
71
- | Async shorthand | No | Dammit operator (`!`) |
72
- | Void functions | No | `def fn!` |
73
- | Reactivity | None | `:=`, `~=`, `~>` |
74
- | Comprehensions | Always IIFE | Context-aware |
75
- | Modules | CommonJS | ES6 |
76
- | Classes | ES5 | ES6 |
77
- | Dependencies | Multiple | **Zero** |
78
- | Parser generator | External (Jison) | **Built-in (Solar)** |
79
- | Self-hosting | No | **Yes** |
80
- | Total LOC | 17,760 | ~10,400 |
81
-
82
- ## Design Principles
83
-
84
- - **Simplicity scales** — Simple IR, clear pipeline, minimal code, comprehensive tests
85
- - **Zero dependencies is a feature** — No supply chain attacks, no version conflicts, no `node_modules` bloat
86
- - **Self-hosting proves quality** — Rip compiles its own parser generator; if it can compile itself, it works
87
-
88
- ---
89
-
90
- # 2. Architecture
91
-
92
- ## The Pipeline
93
-
94
- ```
95
- Source Code → Lexer   → emitTypes  → Parser → S-Expressions   → Codegen → JavaScript
96
-               (1,761)   (types.js)   (359)    (arrays + .loc)   (3,293)   + source map
97
-
98
-                         file.d.ts (when types: "emit")
99
- ```
100
-
101
- ## Key Files
102
-
103
- | File | Purpose | Lines | Modify? |
104
- |------|---------|-------|---------|
105
- | `src/lexer.js` | Lexer + Rewriter | 1,761 | Yes |
106
- | `src/compiler.js` | Compiler + Code Generator | 3,293 | Yes |
107
- | `src/types.js` | Type System (lexer sidecar) | 1,099 | Yes |
108
- | `src/components.js` | Component System (compiler sidecar) | 1,750 | Yes |
109
- | `src/sourcemaps.js` | Source Map V3 Generator | 189 | Yes |
110
- | `src/tags.js` | HTML Tag Classification | 62 | Yes |
111
- | `src/parser.js` | Generated parser | 359 | No (auto-gen) |
112
- | `src/grammar/grammar.rip` | Grammar specification | 944 | Yes (carefully) |
113
- | `src/grammar/solar.rip` | Parser generator | 929 | No |
114
-
115
- ## Example Flow
116
-
117
- ```coffee
118
- # Input
119
- x = 42
120
-
121
- # Tokens (from lexer)
122
- [["IDENTIFIER", "x"], ["=", "="], ["NUMBER", "42"]]
123
-
124
- # S-Expression (from parser)
125
- ["program", ["=", "x", 42]]
126
-
127
- # Generated Code (from codegen)
128
- "x = 42;"
129
- ```
130
-
131
- ---
132
-
133
- # 3. S-Expressions
134
-
135
- S-expressions are simple arrays that serve as Rip's intermediate representation (IR). Each has a **head** (string identifying node type) and **rest** (arguments/children).
136
-
137
- ## Complete Node Type Reference
138
-
139
- ### Top Level
140
- ```javascript
141
- ['program', ...statements]
142
- ```
143
-
144
- ### Variables & Assignment
145
- ```javascript
146
- ['=', target, value]
147
- ['+=', target, value] // And all compound assigns: -=, *=, /=, %=, **=
148
- ['&&=', target, value]
149
- ['||=', target, value]
150
- ['?=', target, value] // Maps to ??=
151
- ['??=', target, value]
152
- ```
153
-
154
- ### Functions
155
- ```javascript
156
- ['def', name, params, body] // Named function
157
- ['->', params, body] // Thin arrow (unbound this)
158
- ['=>', params, body] // Fat arrow (bound this)
159
-
160
- // Parameters can be:
161
- 'name' // Simple param
162
- ['rest', 'name'] // Rest: ...name
163
- ['default', 'name', expr] // Default: name = expr
164
- ['expansion'] // Expansion marker: (a, ..., b)
165
- ['object', ...] // Object destructuring
166
- ['array', ...] // Array destructuring
167
- ```
168
-
169
- ### Calls & Property Access
170
- ```javascript
171
- [callee, ...args] // Function call
172
- ['await', expr] // Await
173
- ['.', obj, 'prop'] // Property: obj.prop
174
- ['?.', obj, 'prop'] // Optional: obj?.prop
175
- ['[]', arr, index] // Index: arr[index]
176
- ['optindex', arr, index] // Optional: arr?.[index]
177
- ['optcall', fn, ...args] // Optional: fn?.(args)
178
- ['new', constructorExpr] // Constructor
179
- ['super', ...args] // Super call
180
- ['tagged-template', tag, str] // Tagged template
181
- ```
182
-
183
- ### Data Structures
184
- ```javascript
185
- ['array', ...elements] // Array literal
186
- ['object', ...pairs] // Object literal (pairs: [key, value])
187
- ['...', expr] // Spread (prefix only)
188
- ```
189
-
190
- ### Operators
191
- ```javascript
192
- // Arithmetic
193
- ['+', left, right] ['-', left, right] ['*', left, right]
194
- ['/', left, right] ['%', left, right] ['**', left, right]
195
-
196
- // Comparison (== compiles to ===)
197
- ['==', left, right] ['!=', left, right]
198
- ['<', left, right] ['<=', left, right]
199
- ['>', left, right] ['>=', left, right]
200
-
201
- // Logical
202
- ['&&', left, right] ['||', left, right] ['??', left, right]
203
-
204
- // Unary
205
- ['!', expr] ['~', expr] ['-', expr]
206
- ['+', expr] ['typeof', expr] ['delete', expr]
207
- ['++', expr, isPostfix] ['--', expr, isPostfix]
208
-
209
- // Special
210
- ['instanceof', expr, type]
211
- ['?', expr] // Existence check
212
- ```
213
-
214
- ### Control Flow
215
- ```javascript
216
- ['if', condition, thenBlock, elseBlock?]
217
- ['unless', condition, body]
218
- ['?:', condition, thenExpr, elseExpr] // Ternary
219
- ['switch', discriminant, cases, defaultCase?]
220
- ```
221
-
222
- ### Loops
223
- ```javascript
224
- ['for-in', vars, iterable, step?, guard?, body]
225
- ['for-of', vars, object, guard?, body]
226
- ['for-as', vars, iterable, async?, guard?, body] // ES6 for-of on iterables
227
- ['while', condition, body]
228
- ['until', condition, body]
229
- ['loop', body]
230
- ['break'] ['continue']
231
- ['break-if', condition] ['continue-if', condition]
232
- ```
233
-
234
- ### Comprehensions
235
- ```javascript
236
- ['comprehension', expr, iterators, guards]
237
- ['object-comprehension', keyExpr, valueExpr, iterators, guards]
238
- ```
239
-
240
- ### Exceptions
241
- ```javascript
242
- ['try', tryBlock, [catchParam, catchBlock]?, finallyBlock?]
243
- ['throw', expr]
244
- ```
245
-
246
- ### Classes
247
- ```javascript
248
- ['class', name, parent?, ...members]
249
- ```
250
-
251
- ### Types
252
- ```javascript
253
- ['enum', name, body] // Enum declaration (runtime JS)
254
- // Type aliases, interfaces → handled by rewriter, never reach parser
255
- ```
256
-
257
- ### Ranges & Slicing
258
- ```javascript
259
- ['..', from, to] // Inclusive range
260
- ['...', from, to] // Exclusive range
261
- ```
262
-
263
- ### Blocks & Modules
264
- ```javascript
265
- ['block', ...statements] // Multiple statements
266
- ['do-iife', expr] // Do expression (IIFE)
267
- ['import', specifiers, source]
268
- ['export', statement]
269
- ['export-default', expr]
270
- ['export-all', source]
271
- ['export-from', specifiers, source]
272
- ```
273
-
274
- ---
275
-
276
- # 4. Lexer & Rewriter
277
-
278
- The lexer (`src/lexer.js`) is a clean reimplementation that replaces the old lexer (3,260 lines) with ~1,760 lines producing the same token vocabulary the parser expects.
279
-
280
- ## Architecture
281
-
282
- - **9 tokenizers** in priority order: identifier, comment, whitespace, line, string, number, regex, js, literal
283
- - **7 rewriter passes**: removeLeadingNewlines, closeOpenCalls, closeOpenIndexes, normalizeLines, tagPostfixConditionals, addImplicitBracesAndParens, addImplicitCallCommas
284
- - **Token format**: `[tag, val]` array with `.pre`, `.data`, `.loc`, `.spaced`, `.newLine` properties
285
-
286
- ## Token Properties
287
-
288
- | Property | Type | Purpose |
289
- |----------|------|---------|
290
- | `.pre` | number | Whitespace count before this token |
291
- | `.data` | object/null | Metadata: `{await, predicate, quote, invert, parsedValue, ...}` |
292
- | `.loc` | `{r, c, n}` | Row, column, length |
293
- | `.spaced` | boolean | Sugar for `.pre > 0` |
294
- | `.newLine` | boolean | Preceded by a newline |
295
-
296
- ## Identifier Suffixes
297
-
298
- | Suffix | Data flag | Meaning | JS output |
299
- |--------|-----------|---------|-----------|
300
- | `!` | `.data.await = true` | Dammit operator | `await` + base name |
301
- | `?` | `.data.predicate = true` | Existence check | `(expr != null)` |
302
-
303
- The `?` suffix is captured only when it is not followed by `.`, `?`, `[`, or `(`, so `?.` (optional chaining, including `?.()` and `?.[i]`), `??` (nullish coalescing), and the shorthand `?()` and `?[i]` forms remain unambiguous.
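The lookahead rule for the `?` suffix can be sketched with a single regular expression. The pattern and the `lexIdentifier` function below are assumptions made for illustration, not the rule used in `src/lexer.js`.

```javascript
// Hypothetical pattern (not Rip's actual lexer rule): an identifier, optionally
// followed by a `?` that is NOT followed by `.`, `?`, `[`, or `(`.
const IDENT_WITH_PREDICATE = /^([A-Za-z_$][\w$]*)(\?(?![.?\[(]))?/;

function lexIdentifier(src) {
  const m = IDENT_WITH_PREDICATE.exec(src);
  if (!m) return null;
  // m[2] is undefined when the `?` belongs to another operator.
  return { name: m[1], predicate: m[2] !== undefined };
}

console.log(lexIdentifier('user?'));      // { name: 'user', predicate: true }
console.log(lexIdentifier('user?.name')); // { name: 'user', predicate: false }
console.log(lexIdentifier('user??{}'));   // { name: 'user', predicate: false }
console.log(lexIdentifier('fn?(1)'));     // { name: 'fn', predicate: false }
```

The negative lookahead is what leaves `?.`, `??`, `?[`, and `?(` available for the optional-chaining and nullish-coalescing tokens.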
304
-
305
- The `!` suffix on `as` in for-loops (`as!`) emits `FORASAWAIT` instead of `FORAS`, enabling `for x as! iterable` as shorthand for `for await x as iterable`.
306
-
307
- ## Language Changes (3.0 Rewrite)
308
-
309
- ### Removed
310
-
311
- | Feature | Old syntax | Replacement |
312
- |---------|-----------|-------------|
313
- | Postfix spread/rest | `x...` | `...x` (ES6 prefix only) |
314
- | Prototype access | `x::y`, `x?::y` | Direct `.prototype` or class syntax; `::` reserved for type annotations |
315
- | `is not` contraction | `x is not y` | `x isnt y` |
316
-
317
- ### Added
318
-
319
- | Feature | Syntax | Purpose |
320
- |---------|--------|---------|
321
- | `for...as` iteration | `for x as iter` | ES6 `for...of` on iterables (replaces `for x from iter`) |
322
- | `as!` async shorthand | `for x as! iter` | Shorthand for `for await x as iter` |
323
-
324
- ### Changed
325
-
326
- | Item | Old | New |
327
- |------|-----|-----|
328
- | Location data | `locationData` (object) | `.loc = {r, c, n}` |
329
- | `for...from` keyword | `FORFROM` | `FORAS` |
330
- | Token metadata | `new String(val)` with props | `.data` object on token |
331
- | Category arrays | `Array` + `indexOf` | `Set` + `.has()` |
332
- | Variable style | `const`/`let` mix | All `let` |
333
- | Rewriter passes | 13 | 7 |
334
-
335
- ### Preserved
336
-
337
- All 9 tokenizer methods, full token vocabulary, implicit call/object/brace detection, string interpolation with recursive sub-lexing, heredoc indent processing, arrow function parameter tagging, `do` IIFE support, `for own x of obj`, all reactive operators (`:=`, `~=`, `~>`, `=!`), all Rip aliases (`and`, `or`, `is`, `isnt`, `not`, `yes`, `no`, `on`, `off`).
338
-
339
- ---
340
-
341
- # 5. Code Generation
342
-
343
- The compiler (`src/compiler.js`) transforms s-expressions into JavaScript. The `CodeGenerator` class is a dispatch table — s-expression heads map to generator methods.
344
-
345
- ## Context-Aware Generation
346
-
347
- Some patterns generate different code based on usage context:
348
-
349
- ```javascript
350
- generate(sexpr, context = 'statement') {
351
- // context can be 'statement' or 'value'
352
- }
353
- ```
354
-
355
- **Comprehensions** are the primary example:
356
-
357
- ```coffee
358
- # Statement context (result discarded) → Plain loop
359
- console.log x for x in arr
360
-
361
- # Value context (result used) → IIFE with array building
362
- result = (x * 2 for x in arr)
363
- ```
364
-
365
- | Parent Node | Child | Context | Reason |
366
- |-------------|-------|---------|--------|
367
- | Assignment | RHS | `'value'` | Value assigned to variable |
368
- | Call | Arguments | `'value'` | Values passed to function |
369
- | Return | Expression | `'value'` | Value returned from function |
370
- | Function | Last statement | `'value'` | Implicit return |
371
- | Function | Non-last statements | `'statement'` | Result discarded |
372
- | Loop | Body | `'statement'` | Loops don't return values |
373
- | If/Unless | Branches | Inherit parent | Pass through context |
374
- | Array | Elements | `'value'` | Values stored in array |
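The context rules in the table can be illustrated with a miniature dispatch table. Everything in this sketch is hypothetical: the `each` node is invented for the example, and value context uses `.map` as a stand-in for the IIFE that Rip emits for comprehensions.

```javascript
// Hypothetical miniature of a dispatch-table code generator (names are illustrative).
const GENERATORS = {
  // Assignment: the right-hand side is always generated in 'value' context.
  '=': (gen, [target, value]) => `${target} = ${gen(value, 'value')}`,
  '*': (gen, [left, right]) => `(${gen(left, 'value')} * ${gen(right, 'value')})`,
  // Invented loop node: plain loop as a statement, array-building form as a value.
  'each': (gen, [item, list, body], context) =>
    context === 'statement'
      ? `for (const ${item} of ${list}) { ${gen(body, 'statement')}; }`
      : `${list}.map((${item}) => ${gen(body, 'value')})`,
};

function generate(sexpr, context = 'statement') {
  if (!Array.isArray(sexpr)) return String(sexpr);   // leaf: identifier or literal
  const [head, ...rest] = sexpr;
  return GENERATORS[head](generate, rest, context);  // unknown heads throw in this sketch
}

const loop = ['each', 'x', 'arr', ['*', 'x', 2]];
console.log(generate(loop));                    // for (const x of arr) { (x * 2); }
console.log(generate(['=', 'result', loop]));   // result = arr.map((x) => (x * 2))
```

The same node produces two different outputs purely because the parent chose the context it passed down.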
375
-
376
- ## Variable Scoping
377
-
378
- CoffeeScript semantics: function-level scoping with closure access.
379
-
380
- - `collectProgramVariables()` — Walks top-level, stops at functions
381
- - `collectFunctionVariables()` — Walks function body, stops at nested functions
382
- - Filters out outer variables (accessed via closure)
383
- - Emits `let` declarations at scope top
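A rough sketch of the collection step described above, assuming only the `['=', target, value]` and function node shapes from the node reference. The function names here are hypothetical and do not reproduce `collectFunctionVariables()`.

```javascript
// Hypothetical sketch of function-level variable collection (not Rip's implementation).
const FUNCTION_HEADS = new Set(['def', '->', '=>']);

function collectAssigned(node, found = new Set()) {
  if (!Array.isArray(node)) return found;
  const [head, ...rest] = node;
  if (FUNCTION_HEADS.has(head)) return found;             // stop at nested functions
  if (head === '=' && typeof rest[0] === 'string') found.add(rest[0]);
  for (const child of rest) collectAssigned(child, found);
  return found;
}

// Drop names already declared in an outer scope, then emit one `let` at scope top.
function declarations(body, outer = new Set()) {
  const locals = [...collectAssigned(body)].filter((name) => !outer.has(name));
  return locals.length ? `let ${locals.join(', ')};` : '';
}

const body = ['block',
  ['=', 'a', 1],
  ['=', 'total', ['+', 'total', 'a']],
  ['=', 'f', ['->', [], ['block', ['=', 'inner', 2]]]],
];
console.log(declarations(body, new Set(['total']))); // let a, f;
```

Here `total` is skipped because it is reached through the closure, and `inner` is skipped because it belongs to the nested function's own scope.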
384
-
385
- ## Auto-Detection
386
-
387
- Functions automatically become async or generators:
388
-
389
- ```coffee
390
- # Contains await or dammit → becomes async
391
- def loadData(id)
392
- user = getUser!(id)
393
- user.posts
394
-
395
- # Contains yield → becomes generator
396
- counter = ->
397
- yield 1
398
- yield 2
399
- ```
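One way such detection could work is a walk over the function body that looks for a marker node and stops at nested functions. This sketch is an assumption, not Rip's code: it only recognizes explicit `['await', expr]` nodes (the dammit operator is carried as token metadata and is not handled here), and the `['yield', expr]` shape is assumed because it does not appear in the node reference.

```javascript
// Hypothetical detection pass (not Rip's implementation).
const FUNCTION_HEADS = new Set(['def', '->', '=>']);

function contains(node, head) {
  if (!Array.isArray(node)) return false;
  if (node[0] === head) return true;
  if (FUNCTION_HEADS.has(node[0])) return false;   // nested functions decide for themselves
  return node.some((child) => contains(child, head));
}

function functionPrefix(body) {
  const isAsync = contains(body, 'await');
  const isGenerator = contains(body, 'yield');     // assumed node shape
  return `${isAsync ? 'async ' : ''}function${isGenerator ? '*' : ''}`;
}

const loadData = ['block',
  ['=', 'user', ['await', ['getUser', 'id']]],
  ['.', 'user', 'posts'],
];
const counter = ['block', ['yield', 1], ['yield', 2]];
const outer = ['block', ['=', 'f', ['->', [], ['block', ['await', 'x']]]]];

console.log(functionPrefix(loadData)); // async function
console.log(functionPrefix(counter));  // function*
console.log(functionPrefix(outer));    // function
```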
400
-
401
- ## Existence Check
402
-
403
- | Syntax | Compiles To |
404
- |--------|-------------|
405
- | `x?` | `(x != null)` |
406
- | `obj.prop?` | `(obj.prop != null)` |
407
- | `x ?? y` | `x ?? y` |
408
- | `x ??= 10` | `x ??= 10` |
409
-
410
- Optional chaining uses ES2020 syntax (both the standard and shorthand forms are supported):
411
-
412
- | Syntax | Compiles To |
413
- |--------|-------------|
414
- | `obj?.prop` | `obj?.prop` |
415
- | `arr?.[0]` | `arr?.[0]` |
416
- | `fn?.(x)` | `fn?.(x)` |
417
- | `arr?[0]` | `arr?.[0]` |
418
- | `fn?(x)` | `fn?.(x)` |
419
-
420
- ## Range Optimization
421
-
422
- ```coffee
423
- for i in [1...100]
424
- process(i)
425
- # → for (let i = 1; i < 100; i++) { process(i); }
426
- ```
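The optimization amounts to recognizing a range node as the iterable and emitting a counting loop instead of materializing an array. The helper below is a hypothetical sketch: it takes the range node directly, assumes an ascending range, and ignores `step` and `guard`.

```javascript
// Hypothetical sketch (not Rip's generator): emit a counting loop for literal ranges.
function generateForIn(loopVar, iterable, bodyCode) {
  const isRange = Array.isArray(iterable) &&
    (iterable[0] === '..' || iterable[0] === '...');
  if (!isRange) {
    return `for (const ${loopVar} of ${iterable}) { ${bodyCode} }`;
  }
  const [head, from, to] = iterable;
  const cmp = head === '..' ? '<=' : '<';   // '..' is inclusive, '...' is exclusive
  return `for (let ${loopVar} = ${from}; ${loopVar} ${cmp} ${to}; ${loopVar}++) { ${bodyCode} }`;
}

console.log(generateForIn('i', ['...', 1, 100], 'process(i);'));
// for (let i = 1; i < 100; i++) { process(i); }
console.log(generateForIn('i', ['..', 1, 3], 'process(i);'));
// for (let i = 1; i <= 3; i++) { process(i); }
```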
427
-
428
- ## String & Regex Processing
429
-
430
- String tokens carry metadata in `.data`:
431
- - `quote`: The quote delimiter (`"`, `'`, `"""`, `'''`, `///`)
432
- - `quote.length === 3`: Indicates a heredoc
433
-
434
- Heredocs use the closing delimiter's column position as the baseline for indentation stripping.
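As a sketch of that baseline rule, the hypothetical `stripHeredocIndent` function below removes up to `closingColumn` leading whitespace characters from each line. It assumes space-only indentation and is not taken from Rip's lexer.

```javascript
// Hypothetical sketch: strip indentation using the closing delimiter's column.
function stripHeredocIndent(body, closingColumn) {
  return body
    .split('\n')
    .map((line) => {
      const indent = line.length - line.trimStart().length;
      // Never strip more than the line's own leading whitespace.
      return line.slice(Math.min(indent, closingColumn));
    })
    .join('\n');
}

const body = '    hello\n      world';
console.log(JSON.stringify(stripHeredocIndent(body, 4))); // "hello\n  world"
```

Indentation deeper than the baseline is kept, so relative indentation inside the heredoc survives.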
435
-
436
- REGEX tokens store `delimiter` and optional `heregex` flags in `token.data`.
437
-
438
- ---
439
-
440
- # 6. Compiler
441
-
442
- The compiler (`src/compiler.js`) is a clean reimplementation replacing the old compiler (6,016 lines) with 3,293 lines producing identical JavaScript output.
443
-
444
- ## Structure
445
-
446
- ```
447
- CodeGenerator class
448
- - GENERATORS dispatch table (~55 generators)
449
- - Variable collection (program + function scope)
450
- - Main generate() dispatch
451
- - ~55 generate* methods
452
- - Body/formatting/utility helpers
453
- - Reactive runtime (inline string, ~270 lines)
454
- Compiler class (with shim adapter for new lexer)
455
- Convenience exports
456
- ```
457
-
458
- ## Metadata Bridge
459
-
460
- Two one-line helpers isolate all `new String()` awareness:
461
-
462
- ```javascript
463
- let meta = (node, key) => node instanceof String ? node[key] : undefined;
464
- let str = (node) => node instanceof String ? node.valueOf() : node;
465
- ```
466
-
467
- The `Compiler` class's lexer adapter reconstructs `new String()` wrapping from the new lexer's `token.data` property, so grammar actions pass metadata through s-expressions unchanged.
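To show how the two helpers behave, the sketch below pairs them with a hypothetical `wrapToken` function standing in for the adapter step. `wrapToken` is an assumption for illustration; only `meta` and `str` come from the text above.

```javascript
let meta = (node, key) => node instanceof String ? node[key] : undefined;
let str = (node) => node instanceof String ? node.valueOf() : node;

// Hypothetical adapter step: copy a token's `.data` onto a String wrapper object.
function wrapToken(val, data) {
  if (!data) return val;                          // no metadata: keep the primitive
  return Object.assign(new String(val), data);
}

const wrapped = wrapToken('getUser', { await: true });
const plain = wrapToken('x', null);

console.log(meta(wrapped, 'await'));  // true
console.log(str(wrapped));            // getUser
console.log(typeof str(wrapped));     // string
console.log(meta(plain, 'await'));    // undefined
```

Code that only ever calls `meta()` and `str()` does not need to know whether a node is a primitive string or a wrapper, which is what keeps the `new String()` awareness confined to these two helpers.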
468
-
469
- ## Removed Generators
470
-
471
- | Generator | S-expr | Reason |
472
- |-----------|--------|--------|
473
- | `generatePrototype` | `::` | Feature removed from lexer |
474
- | `generateOptionalPrototype` | `?::` | Feature removed from lexer |
475
-
476
- ## Renamed: `for-from` → `for-as`
477
-
478
- - `GENERATORS['for-as']` replaces `GENERATORS['for-from']`
479
- - Grammar adds `FORASAWAIT` token: `for x as! iter` → `for await x as iter`
480
- - Both forms produce the same s-expression: `["for-as", vars, iterable, true, guard, body]`
481
-
482
- ## Consolidation
483
-
484
- | Area | Old lines | New lines | Reduction |
485
- |------|-----------|-----------|-----------|
486
- | Total file | 6,016 | ~3,290 | **45%** |
487
- | Body generation | ~500 | ~200 | 60% |
488
- | Variable collection | ~230 | ~100 | 57% |
489
- | Helper methods | ~600 | ~250 | 58% |
490
-
491
- ---
492
-
493
- # 7. Solar Parser Generator
494
-
495
- **Solar** is a complete SLR(1) parser generator included with Rip — written in Rip, compiled by Rip, zero external dependencies.
496
-
497
- **Location:** `src/grammar/solar.rip` (929 lines)
498
-
499
- ## Grammar Syntax
500
-
501
- ```coffeescript
502
- o = (pattern, action, options) ->
503
- pattern = pattern.trim().replace /\s{2,}/g, ' '
504
- [pattern, action ? 1, options]
505
- ```
506
-
507
- **Style 1: Pass-Through** — Omit the action; it defaults to `1`, so the rule returns the value of its first symbol:
508
- ```coffeescript
509
- Expression: [
510
- o 'Value'
511
- o 'Operation'
512
- ]
513
- ```
514
-
515
- **Style 2: S-Expression** — Bare numbers become token references:
516
- ```coffeescript
517
- For: [
518
- o 'FOR ForVariables FOROF Expression Block', '["for-of", 2, 4, null, 5]'
519
- ]
520
- ```
521
-
522
- **Style 3: Advanced** — `$n` patterns for conditional logic:
523
- ```coffeescript
524
- Parenthetical: [
525
- o '( Body )', '$2.length === 1 ? $2[0] : $2'
526
- ]
527
- ```
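The idea behind Style 2 (bare numbers act as 1-based references to the matched symbols) can be sketched as follows. The `compileAction` function is hypothetical and is not how Solar generates its actions; it relies on the Style 2 action string happening to be valid JSON.

```javascript
// Hypothetical sketch (not Solar's implementation): expand a Style 2 action.
function compileAction(action) {
  const template = JSON.parse(action);           // Style 2 actions are JSON-shaped
  const fill = (item, symbols) => {
    if (Array.isArray(item)) return item.map((child) => fill(child, symbols));
    if (typeof item === 'number') return symbols[item - 1];  // 1-based reference
    return item;                                 // strings and null pass through
  };
  return (symbols) => fill(template, symbols);
}

// 'FOR ForVariables FOROF Expression Block' matched five symbols:
const build = compileAction('["for-of", 2, 4, null, 5]');
const node = build(['for', ['k', 'v'], 'of', 'obj', ['block']]);
console.log(JSON.stringify(node)); // ["for-of",["k","v"],"obj",null,["block"]]
```

A limitation of this simplified version is that a literal number cannot appear in the output, since every number is treated as a reference.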
528
-
529
- ## Performance
530
-
531
- | Metric | Jison | Solar |
532
- |--------|-------|-------|
533
- | Parse time | 12,500ms | ~50ms |
534
- | Dependencies | Many | Zero |
535
- | Self-hosting | No | Yes |
536
- | Code size | 2,285 LOC | 929 LOC |
537
-
538
- After modifying `src/grammar/grammar.rip`:
539
-
540
- ```bash
541
- bun run parser # Regenerates src/parser.js
542
- ```
543
-
544
- ---
545
-
546
- # 8. Debug Tools
547
-
548
- ```bash
549
- # See tokens from lexer
550
- echo 'x = 42' | ./bin/rip -t
551
-
552
- # See s-expressions from parser
553
- echo 'x = 42' | ./bin/rip -s
554
-
555
- # See generated JavaScript
556
- echo 'x = 42' | ./bin/rip -c
557
-
558
- # Interactive REPL with debug modes
559
- ./bin/rip
560
- rip> .tokens # Toggle token display
561
- rip> .sexp # Toggle s-expression display
562
- rip> .js # Toggle JS display
563
- ```
564
-
565
- ## REPL Architecture
566
-
567
- The CLI REPL (`src/repl.js`) uses a `vm`-based sandbox with two eval paths:
568
-
569
- - **Primary:** `vm.runInContext` with async IIFE wrapper — handles all code including `await`
570
- - **Fallback:** `vm.SourceTextModule` — used only when `import` statements are present (required for module linking)
571
-
572
- This split works around Bun bug [#22287](https://github.com/oven-sh/bun/issues/22287) where `vm.SourceTextModule` crashes on top-level `await`.
573
-
574
- The stdlib is injected once at REPL startup via `getStdlibCode()` — the same function the compiler uses, ensuring a single source of truth. Variables persist across evaluations via a `__vars` object.
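The primary eval path can be approximated with Node's `vm` module. This is a minimal sketch under stated assumptions: the `evaluate` function is hypothetical, the input is already-compiled JavaScript, and state is kept by writing to `__vars` explicitly, which may differ from how `src/repl.js` wires it.

```javascript
const vm = require('node:vm');

// One long-lived context so state survives between evaluations.
const context = vm.createContext({ __vars: {} });

// Hypothetical evaluate(): wrap the code in an async IIFE so `await` is allowed.
async function evaluate(js) {
  const wrapped = `(async () => { ${js} })()`;
  return vm.runInContext(wrapped, context);
}

evaluate('__vars.x = 41;')
  .then(() => evaluate('return await Promise.resolve(__vars.x + 1);'))
  .then((value) => console.log(value)); // 42
```

The second call sees the value stored by the first because both run in the same context object.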
575
-
576
- The browser REPL (`docs/index.html`) uses a hidden iframe as its sandbox. Code is compiled with `skipPreamble: true` and wrapped in an async IIFE for `await` support. The stdlib and reactive runtime are injected into the iframe context at initialization.
577
-
578
- ---
579
-
580
- # 9. Future Work
581
-
582
- - Parser update to read `.data` directly instead of `new String()` properties
583
- - Once parser supports `.data`, the `meta()`/`str()` helpers become trivial to update
584
-
585
- ---
586
-
587
- **See Also:**
588
- - [RIP-LANG.md](RIP-LANG.md) — Language reference (includes reactivity deep dive)
589
- - [RIP-TYPES.md](RIP-TYPES.md) — Type system specification
590
-
591
- ---
592
-
593
- *Rip 3.13 — 1,265 tests — Zero dependencies — Self-hosting — ~13,500 LOC*