redscript-mc 1.2.30 → 2.0.0

Files changed (269)
  1. package/.claude/commands/build-test.md +10 -0
  2. package/.claude/commands/deploy-demo.md +12 -0
  3. package/.claude/commands/stage-status.md +13 -0
  4. package/.claude/settings.json +12 -0
  5. package/.github/workflows/ci.yml +1 -0
  6. package/CLAUDE.md +231 -0
  7. package/demo.gif +0 -0
  8. package/dist/cli.js +2 -554
  9. package/dist/compile.js +2 -266
  10. package/dist/index.js +2 -159
  11. package/dist/lowering/index.js +5 -3
  12. package/dist/src/__tests__/cli.test.d.ts +1 -0
  13. package/dist/src/__tests__/cli.test.js +104 -0
  14. package/dist/src/__tests__/codegen.test.d.ts +1 -0
  15. package/dist/src/__tests__/codegen.test.js +152 -0
  16. package/dist/src/__tests__/compile-all.test.d.ts +10 -0
  17. package/dist/src/__tests__/compile-all.test.js +108 -0
  18. package/dist/src/__tests__/dce.test.d.ts +1 -0
  19. package/dist/src/__tests__/dce.test.js +102 -0
  20. package/dist/src/__tests__/diagnostics.test.d.ts +4 -0
  21. package/dist/src/__tests__/diagnostics.test.js +177 -0
  22. package/dist/src/__tests__/e2e.test.d.ts +6 -0
  23. package/dist/src/__tests__/e2e.test.js +1789 -0
  24. package/dist/src/__tests__/entity-types.test.d.ts +1 -0
  25. package/dist/src/__tests__/entity-types.test.js +203 -0
  26. package/dist/src/__tests__/formatter.test.d.ts +1 -0
  27. package/dist/src/__tests__/formatter.test.js +40 -0
  28. package/dist/src/__tests__/lexer.test.d.ts +1 -0
  29. package/dist/src/__tests__/lexer.test.js +343 -0
  30. package/dist/src/__tests__/lowering.test.d.ts +1 -0
  31. package/dist/src/__tests__/lowering.test.js +1015 -0
  32. package/dist/src/__tests__/macro.test.d.ts +8 -0
  33. package/dist/src/__tests__/macro.test.js +306 -0
  34. package/dist/src/__tests__/mc-integration.test.d.ts +12 -0
  35. package/dist/src/__tests__/mc-integration.test.js +817 -0
  36. package/dist/src/__tests__/mc-syntax.test.d.ts +1 -0
  37. package/dist/src/__tests__/mc-syntax.test.js +124 -0
  38. package/dist/src/__tests__/nbt.test.d.ts +1 -0
  39. package/dist/src/__tests__/nbt.test.js +82 -0
  40. package/dist/src/__tests__/optimizer-advanced.test.d.ts +1 -0
  41. package/dist/src/__tests__/optimizer-advanced.test.js +124 -0
  42. package/dist/src/__tests__/optimizer.test.d.ts +1 -0
  43. package/dist/src/__tests__/optimizer.test.js +149 -0
  44. package/dist/src/__tests__/parser.test.d.ts +1 -0
  45. package/dist/src/__tests__/parser.test.js +807 -0
  46. package/dist/src/__tests__/repl.test.d.ts +1 -0
  47. package/dist/src/__tests__/repl.test.js +27 -0
  48. package/dist/src/__tests__/runtime.test.d.ts +1 -0
  49. package/dist/src/__tests__/runtime.test.js +289 -0
  50. package/dist/src/__tests__/stdlib-advanced.test.d.ts +4 -0
  51. package/dist/src/__tests__/stdlib-advanced.test.js +374 -0
  52. package/dist/src/__tests__/stdlib-bigint.test.d.ts +7 -0
  53. package/dist/src/__tests__/stdlib-bigint.test.js +426 -0
  54. package/dist/src/__tests__/stdlib-math.test.d.ts +7 -0
  55. package/dist/src/__tests__/stdlib-math.test.js +351 -0
  56. package/dist/src/__tests__/stdlib-vec.test.d.ts +4 -0
  57. package/dist/src/__tests__/stdlib-vec.test.js +263 -0
  58. package/dist/src/__tests__/structure-optimizer.test.d.ts +1 -0
  59. package/dist/src/__tests__/structure-optimizer.test.js +33 -0
  60. package/dist/src/__tests__/typechecker.test.d.ts +1 -0
  61. package/dist/src/__tests__/typechecker.test.js +552 -0
  62. package/dist/src/__tests__/var-allocator.test.d.ts +1 -0
  63. package/dist/src/__tests__/var-allocator.test.js +69 -0
  64. package/dist/src/ast/types.d.ts +515 -0
  65. package/dist/src/ast/types.js +9 -0
  66. package/dist/src/builtins/metadata.d.ts +36 -0
  67. package/dist/src/builtins/metadata.js +1014 -0
  68. package/dist/src/cli.d.ts +11 -0
  69. package/dist/src/cli.js +443 -0
  70. package/dist/src/codegen/cmdblock/index.d.ts +26 -0
  71. package/dist/src/codegen/cmdblock/index.js +45 -0
  72. package/dist/src/codegen/mcfunction/index.d.ts +40 -0
  73. package/dist/src/codegen/mcfunction/index.js +606 -0
  74. package/dist/src/codegen/structure/index.d.ts +24 -0
  75. package/dist/src/codegen/structure/index.js +279 -0
  76. package/dist/src/codegen/var-allocator.d.ts +45 -0
  77. package/dist/src/codegen/var-allocator.js +104 -0
  78. package/dist/src/compile.d.ts +37 -0
  79. package/dist/src/compile.js +165 -0
  80. package/dist/src/diagnostics/index.d.ts +44 -0
  81. package/dist/src/diagnostics/index.js +140 -0
  82. package/dist/src/events/types.d.ts +35 -0
  83. package/dist/src/events/types.js +59 -0
  84. package/dist/src/formatter/index.d.ts +1 -0
  85. package/dist/src/formatter/index.js +26 -0
  86. package/dist/src/index.d.ts +22 -0
  87. package/dist/src/index.js +45 -0
  88. package/dist/src/ir/builder.d.ts +33 -0
  89. package/dist/src/ir/builder.js +99 -0
  90. package/dist/src/ir/types.d.ts +132 -0
  91. package/dist/src/ir/types.js +15 -0
  92. package/dist/src/lexer/index.d.ts +37 -0
  93. package/dist/src/lexer/index.js +569 -0
  94. package/dist/src/lowering/index.d.ts +188 -0
  95. package/dist/src/lowering/index.js +3405 -0
  96. package/dist/src/mc-test/client.d.ts +128 -0
  97. package/dist/src/mc-test/client.js +174 -0
  98. package/dist/src/mc-test/runner.d.ts +28 -0
  99. package/dist/src/mc-test/runner.js +151 -0
  100. package/dist/src/mc-test/setup.d.ts +11 -0
  101. package/dist/src/mc-test/setup.js +98 -0
  102. package/dist/src/mc-validator/index.d.ts +17 -0
  103. package/dist/src/mc-validator/index.js +322 -0
  104. package/dist/src/nbt/index.d.ts +86 -0
  105. package/dist/src/nbt/index.js +250 -0
  106. package/dist/src/optimizer/commands.d.ts +38 -0
  107. package/dist/src/optimizer/commands.js +451 -0
  108. package/dist/src/optimizer/dce.d.ts +34 -0
  109. package/dist/src/optimizer/dce.js +639 -0
  110. package/dist/src/optimizer/passes.d.ts +34 -0
  111. package/dist/src/optimizer/passes.js +243 -0
  112. package/dist/src/optimizer/structure.d.ts +9 -0
  113. package/dist/src/optimizer/structure.js +356 -0
  114. package/dist/src/parser/index.d.ts +93 -0
  115. package/dist/src/parser/index.js +1687 -0
  116. package/dist/src/repl.d.ts +16 -0
  117. package/dist/src/repl.js +165 -0
  118. package/dist/src/runtime/index.d.ts +107 -0
  119. package/dist/src/runtime/index.js +1409 -0
  120. package/dist/src/typechecker/index.d.ts +61 -0
  121. package/dist/src/typechecker/index.js +1034 -0
  122. package/dist/src/types/entity-hierarchy.d.ts +29 -0
  123. package/dist/src/types/entity-hierarchy.js +107 -0
  124. package/dist/src2/__tests__/e2e/basic.test.d.ts +8 -0
  125. package/dist/src2/__tests__/e2e/basic.test.js +140 -0
  126. package/dist/src2/__tests__/e2e/macros.test.d.ts +9 -0
  127. package/dist/src2/__tests__/e2e/macros.test.js +182 -0
  128. package/dist/src2/__tests__/e2e/migrate.test.d.ts +13 -0
  129. package/dist/src2/__tests__/e2e/migrate.test.js +2739 -0
  130. package/dist/src2/__tests__/hir/desugar.test.d.ts +1 -0
  131. package/dist/src2/__tests__/hir/desugar.test.js +234 -0
  132. package/dist/src2/__tests__/lir/lower.test.d.ts +1 -0
  133. package/dist/src2/__tests__/lir/lower.test.js +559 -0
  134. package/dist/src2/__tests__/lir/types.test.d.ts +1 -0
  135. package/dist/src2/__tests__/lir/types.test.js +185 -0
  136. package/dist/src2/__tests__/lir/verify.test.d.ts +1 -0
  137. package/dist/src2/__tests__/lir/verify.test.js +221 -0
  138. package/dist/src2/__tests__/mir/arithmetic.test.d.ts +1 -0
  139. package/dist/src2/__tests__/mir/arithmetic.test.js +130 -0
  140. package/dist/src2/__tests__/mir/control-flow.test.d.ts +1 -0
  141. package/dist/src2/__tests__/mir/control-flow.test.js +205 -0
  142. package/dist/src2/__tests__/mir/verify.test.d.ts +1 -0
  143. package/dist/src2/__tests__/mir/verify.test.js +223 -0
  144. package/dist/src2/__tests__/optimizer/block_merge.test.d.ts +1 -0
  145. package/dist/src2/__tests__/optimizer/block_merge.test.js +78 -0
  146. package/dist/src2/__tests__/optimizer/branch_simplify.test.d.ts +1 -0
  147. package/dist/src2/__tests__/optimizer/branch_simplify.test.js +58 -0
  148. package/dist/src2/__tests__/optimizer/constant_fold.test.d.ts +1 -0
  149. package/dist/src2/__tests__/optimizer/constant_fold.test.js +131 -0
  150. package/dist/src2/__tests__/optimizer/copy_prop.test.d.ts +1 -0
  151. package/dist/src2/__tests__/optimizer/copy_prop.test.js +91 -0
  152. package/dist/src2/__tests__/optimizer/dce.test.d.ts +1 -0
  153. package/dist/src2/__tests__/optimizer/dce.test.js +76 -0
  154. package/dist/src2/__tests__/optimizer/pipeline.test.d.ts +1 -0
  155. package/dist/src2/__tests__/optimizer/pipeline.test.js +102 -0
  156. package/dist/src2/emit/compile.d.ts +19 -0
  157. package/dist/src2/emit/compile.js +80 -0
  158. package/dist/src2/emit/index.d.ts +17 -0
  159. package/dist/src2/emit/index.js +172 -0
  160. package/dist/src2/hir/lower.d.ts +15 -0
  161. package/dist/src2/hir/lower.js +378 -0
  162. package/dist/src2/hir/types.d.ts +373 -0
  163. package/dist/src2/hir/types.js +16 -0
  164. package/dist/src2/lir/lower.d.ts +15 -0
  165. package/dist/src2/lir/lower.js +453 -0
  166. package/dist/src2/lir/types.d.ts +136 -0
  167. package/dist/src2/lir/types.js +11 -0
  168. package/dist/src2/lir/verify.d.ts +14 -0
  169. package/dist/src2/lir/verify.js +113 -0
  170. package/dist/src2/mir/lower.d.ts +9 -0
  171. package/dist/src2/mir/lower.js +1030 -0
  172. package/dist/src2/mir/macro.d.ts +22 -0
  173. package/dist/src2/mir/macro.js +168 -0
  174. package/dist/src2/mir/types.d.ts +183 -0
  175. package/dist/src2/mir/types.js +11 -0
  176. package/dist/src2/mir/verify.d.ts +16 -0
  177. package/dist/src2/mir/verify.js +216 -0
  178. package/dist/src2/optimizer/block_merge.d.ts +12 -0
  179. package/dist/src2/optimizer/block_merge.js +84 -0
  180. package/dist/src2/optimizer/branch_simplify.d.ts +9 -0
  181. package/dist/src2/optimizer/branch_simplify.js +28 -0
  182. package/dist/src2/optimizer/constant_fold.d.ts +10 -0
  183. package/dist/src2/optimizer/constant_fold.js +85 -0
  184. package/dist/src2/optimizer/copy_prop.d.ts +9 -0
  185. package/dist/src2/optimizer/copy_prop.js +113 -0
  186. package/dist/src2/optimizer/dce.d.ts +8 -0
  187. package/dist/src2/optimizer/dce.js +155 -0
  188. package/dist/src2/optimizer/pipeline.d.ts +10 -0
  189. package/dist/src2/optimizer/pipeline.js +42 -0
  190. package/dist/tsconfig.tsbuildinfo +1 -0
  191. package/docs/compiler-pipeline-redesign.md +2243 -0
  192. package/docs/optimization-ideas.md +1076 -0
  193. package/editors/vscode/package-lock.json +3 -3
  194. package/editors/vscode/package.json +1 -1
  195. package/jest.config.js +1 -1
  196. package/package.json +6 -5
  197. package/scripts/postbuild.js +15 -0
  198. package/src/__tests__/cli.test.ts +8 -220
  199. package/src/__tests__/dce.test.ts +11 -56
  200. package/src/__tests__/diagnostics.test.ts +59 -38
  201. package/src/__tests__/mc-integration.test.ts +1 -2
  202. package/src/ast/types.ts +6 -1
  203. package/src/cli.ts +29 -156
  204. package/src/compile.ts +6 -162
  205. package/src/index.ts +14 -178
  206. package/src/mc-test/runner.ts +4 -3
  207. package/src/parser/index.ts +1 -1
  208. package/src/repl.ts +1 -1
  209. package/src/runtime/index.ts +1 -1
  210. package/src2/__tests__/e2e/basic.test.ts +154 -0
  211. package/src2/__tests__/e2e/macros.test.ts +199 -0
  212. package/src2/__tests__/e2e/migrate.test.ts +3008 -0
  213. package/src2/__tests__/hir/desugar.test.ts +263 -0
  214. package/src2/__tests__/lir/lower.test.ts +619 -0
  215. package/src2/__tests__/lir/types.test.ts +207 -0
  216. package/src2/__tests__/lir/verify.test.ts +249 -0
  217. package/src2/__tests__/mir/arithmetic.test.ts +156 -0
  218. package/src2/__tests__/mir/control-flow.test.ts +242 -0
  219. package/src2/__tests__/mir/verify.test.ts +254 -0
  220. package/src2/__tests__/optimizer/block_merge.test.ts +84 -0
  221. package/src2/__tests__/optimizer/branch_simplify.test.ts +64 -0
  222. package/src2/__tests__/optimizer/constant_fold.test.ts +145 -0
  223. package/src2/__tests__/optimizer/copy_prop.test.ts +99 -0
  224. package/src2/__tests__/optimizer/dce.test.ts +83 -0
  225. package/src2/__tests__/optimizer/pipeline.test.ts +116 -0
  226. package/src2/emit/compile.ts +99 -0
  227. package/src2/emit/index.ts +222 -0
  228. package/src2/hir/lower.ts +428 -0
  229. package/src2/hir/types.ts +216 -0
  230. package/src2/lir/lower.ts +556 -0
  231. package/src2/lir/types.ts +109 -0
  232. package/src2/lir/verify.ts +129 -0
  233. package/src2/mir/lower.ts +1160 -0
  234. package/src2/mir/macro.ts +167 -0
  235. package/src2/mir/types.ts +106 -0
  236. package/src2/mir/verify.ts +218 -0
  237. package/src2/optimizer/block_merge.ts +93 -0
  238. package/src2/optimizer/branch_simplify.ts +27 -0
  239. package/src2/optimizer/constant_fold.ts +88 -0
  240. package/src2/optimizer/copy_prop.ts +106 -0
  241. package/src2/optimizer/dce.ts +133 -0
  242. package/src2/optimizer/pipeline.ts +44 -0
  243. package/tsconfig.json +2 -2
  244. package/src/__tests__/codegen.test.ts +0 -161
  245. package/src/__tests__/e2e.test.ts +0 -2039
  246. package/src/__tests__/entity-types.test.ts +0 -236
  247. package/src/__tests__/lowering.test.ts +0 -1185
  248. package/src/__tests__/macro.test.ts +0 -343
  249. package/src/__tests__/nbt.test.ts +0 -58
  250. package/src/__tests__/optimizer-advanced.test.ts +0 -144
  251. package/src/__tests__/optimizer.test.ts +0 -162
  252. package/src/__tests__/runtime.test.ts +0 -305
  253. package/src/__tests__/stdlib-advanced.test.ts +0 -379
  254. package/src/__tests__/stdlib-bigint.test.ts +0 -427
  255. package/src/__tests__/stdlib-math.test.ts +0 -374
  256. package/src/__tests__/stdlib-vec.test.ts +0 -259
  257. package/src/__tests__/structure-optimizer.test.ts +0 -38
  258. package/src/__tests__/var-allocator.test.ts +0 -75
  259. package/src/codegen/cmdblock/index.ts +0 -63
  260. package/src/codegen/mcfunction/index.ts +0 -662
  261. package/src/codegen/structure/index.ts +0 -346
  262. package/src/codegen/var-allocator.ts +0 -104
  263. package/src/ir/builder.ts +0 -116
  264. package/src/ir/types.ts +0 -134
  265. package/src/lowering/index.ts +0 -3876
  266. package/src/optimizer/commands.ts +0 -534
  267. package/src/optimizer/dce.ts +0 -679
  268. package/src/optimizer/passes.ts +0 -250
  269. package/src/optimizer/structure.ts +0 -450
@@ -0,0 +1,2243 @@
1
+ # RedScript Compiler Pipeline Redesign
2
+
3
+ > Status: **planned** — target next major refactor cycle
4
+ > Written: 2026-03-15
5
+
6
+ ---
7
+
8
+ ## Motivation
9
+
10
+ The current compiler is a single-pass lowering that goes roughly
11
+ `Parser → AST → IR (2-address) → MCFunction`.
12
+ It works, but has accumulated tech debt that makes further optimizations fragile:
13
+
14
+ - IR is 2-address, which complicates use-def analysis and CSE
15
+ - "Optimization" and "lowering" are interleaved in the same pass
16
+ - Macro handling, builtin dispatch, and control-flow lowering all happen together
17
+ - Adding a new optimization often requires touching 3+ files
18
+
19
+ The plan below separates concerns cleanly into 7 stages.
20
+
21
+ ---
22
+
23
+ ## Stage 1 — Frontend: Parser → AST
24
+
25
+ **Responsibilities:** source text → well-formed typed syntax tree
26
+
27
+ - Lexing / parsing (already solid)
28
+ - Name resolution (resolve identifiers to their declarations)
29
+ - Type checking (infer and check all expression types)
30
+ - Scope analysis (closures, shadowing, struct fields)
31
+
32
+ **Output:** a fully type-annotated AST where every node carries its type.
33
+
34
+ No desugaring here. Keep AST faithful to source.
35
+
36
+ ---
37
+
38
+ ## Stage 2 — AST → HIR *(High-level IR)*
39
+
40
+ **Goal:** eliminate syntax sugar; keep structured control flow.
41
+
42
+ Transforms applied:
43
+
44
+ | Source construct | HIR form |
45
+ |---|---|
46
+ | `for (init; cond; step)` | `while` loop with explicit init/step |
47
+ | `a += b` | `a = a + b` (or dedicated `Op` node) |
48
+ | `let x = complex_expr` | declaration + separate assignment |
49
+ | `a && b` | `if a { b } else { false }` |
50
+ | `a \|\| b` | `if a { true } else { b }` |
51
+ | `cond ? a : b` | `if cond { a } else { b }` |
52
+ | comma expressions | sequential statements |
53
+ | `foreach` | explicit iterator variable + while |
54
+
55
+ HIR is still **structured** (no gotos, no basic blocks yet).
56
+ All types are known; all names are resolved.
57
+
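As a rough illustration of the table above, the short-circuit and ternary rows could be desugared as in the sketch below. The `Expr` and `HirExpr` shapes are hypothetical simplifications made up for this example, not the types in `src2/hir/types.ts`.

```typescript
// Hypothetical, simplified AST and HIR shapes (not the real RedScript types).
type Expr =
  | { kind: "bool"; value: boolean }
  | { kind: "var"; name: string }
  | { kind: "binary"; op: "&&" | "||"; left: Expr; right: Expr }
  | { kind: "ternary"; cond: Expr; then: Expr; else: Expr };

type HirExpr =
  | { kind: "bool"; value: boolean }
  | { kind: "var"; name: string }
  | { kind: "if"; cond: HirExpr; then: HirExpr; else: HirExpr };

// Rewrite `&&`, `||`, and `?:` into a single structured `if` form.
function desugar(e: Expr): HirExpr {
  switch (e.kind) {
    case "bool":
    case "var":
      return e;
    case "binary": {
      const left = desugar(e.left);
      const right = desugar(e.right);
      return e.op === "&&"
        ? { kind: "if", cond: left, then: right, else: { kind: "bool", value: false } }
        : { kind: "if", cond: left, then: { kind: "bool", value: true }, else: right };
    }
    case "ternary":
      return { kind: "if", cond: desugar(e.cond), then: desugar(e.then), else: desugar(e.else) };
  }
}
```

After this step the later stages only need to handle `if`, which is the point of keeping HIR structured but sugar-free.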
58
+ ---
59
+
60
+ ## Stage 3 — HIR → MIR *(Mid-level IR, 3-address CFG)*
61
+
62
+ **Goal:** structured control flow → explicit Control Flow Graph (CFG).
63
+
64
+ - Introduce basic blocks with explicit predecessors/successors
65
+ - Introduce unlimited fresh temporaries
66
+ - **3-address form**: every instruction has at most one operation
67
+
68
+ ```
69
+ # MIR example
70
+ t1 = add a, b
71
+ t2 = mul t1, c
72
+ x = mov t2
73
+ ```
74
+
75
+ Why 3-address (not 2-address like the current IR)?
76
+
77
+ - Use-def chains are trivial to build
78
+ - CSE: identical RHS expressions are immediately comparable
79
+ - Constant propagation: single definition per temp makes dataflow simple
80
+ - Expression reordering: no aliasing through the destination
81
+
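A minimal sketch of how an expression tree might be flattened into the 3-address form shown above, assuming a counter-based supply of fresh temporaries. The `Expr` and `MirInstr` shapes and the `lowerAssign` helper are hypothetical, chosen only to reproduce the `t1`/`t2` example.

```typescript
// Hypothetical expression tree and MIR instruction shapes, for illustration only.
type Expr =
  | { kind: "var"; name: string }
  | { kind: "const"; value: number }
  | { kind: "bin"; op: "add" | "mul"; left: Expr; right: Expr };

type Operand = string | number;
type MirInstr = { dst: string; op: "add" | "mul" | "mov"; args: Operand[] };

// Flatten `target = expr` into instructions with at most one operation each.
function lowerAssign(target: string, expr: Expr): MirInstr[] {
  const out: MirInstr[] = [];
  let counter = 0;
  const fresh = (): string => `t${++counter}`; // each temp is defined exactly once

  const visit = (e: Expr): Operand => {
    if (e.kind === "var") return e.name;
    if (e.kind === "const") return e.value;
    const left = visit(e.left);
    const right = visit(e.right);
    const dst = fresh();
    out.push({ dst, op: e.op, args: [left, right] });
    return dst;
  };

  const result = visit(expr);
  out.push({ dst: target, op: "mov", args: [result] });
  return out;
}

// Render one instruction in the textual style used in this document.
const format = (i: MirInstr): string => `${i.dst} = ${i.op} ${i.args.join(", ")}`;
```

Lowering `x = (a + b) * c` with this sketch yields the three instructions from the MIR example: `t1 = add a, b`, `t2 = mul t1, c`, `x = mov t2`.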
82
+ Control flow:
83
+ - `if` → conditional branch + merge block
84
+ - `while` → loop header + body + exit
85
+ - `return` → explicit jump to exit block
86
+ - `break`/`continue` → explicit jumps
87
+
88
+ ---
89
+
90
+ ## Stage 4 — MIR Optimization Passes
91
+
92
+ Run on the 3-address CFG. Passes are composable and independent.
93
+
94
+ ### Required (correctness + baseline perf)
95
+
96
+ | Pass | Description |
97
+ |---|---|
98
+ | **Constant folding** | `t = add 3, 4` → `t = 7` |
99
+ | **Constant propagation** | replace uses of single-def consts with the value |
100
+ | **Copy propagation** | `x = mov y; ... use x` → `... use y` |
101
+ | **Dead code elimination** | remove defs with no live uses |
102
+ | **Unreachable block elimination** | remove blocks with no predecessors |
103
+ | **Block merging** | merge unconditional-jump-only block chains |
104
+ | **Branch simplification** | `if true` / `if false` → unconditional jump |
105
+
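The first two rows of the table could be sketched as below for a single straight-line block. This assumes every temporary has exactly one definition and that scoreboard arithmetic wraps at 32 bits; the `Instr` shape is hypothetical and much smaller than a real MIR.

```typescript
// Hypothetical straight-line MIR: constants, plus add/mul on operands.
type Operand = { kind: "const"; value: number } | { kind: "temp"; name: string };
type Instr =
  | { kind: "const"; dst: string; value: number }
  | { kind: "add" | "mul"; dst: string; a: Operand; b: Operand };
type Pass = (instrs: Instr[]) => Instr[];

// `t = add 3, 4` becomes `t = 7`; results wrap to signed 32-bit integers.
const constantFold: Pass = (instrs) =>
  instrs.map((i): Instr => {
    if (i.kind === "const") return i;
    if (i.a.kind !== "const" || i.b.kind !== "const") return i;
    const value =
      i.kind === "add" ? (i.a.value + i.b.value) | 0 : Math.imul(i.a.value, i.b.value);
    return { kind: "const", dst: i.dst, value };
  });

// Replace uses of temps whose single definition is a known constant.
const constantProp: Pass = (instrs) => {
  const known = new Map<string, number>();
  const subst = (o: Operand): Operand =>
    o.kind === "temp" && known.has(o.name) ? { kind: "const", value: known.get(o.name)! } : o;
  return instrs.map((i): Instr => {
    if (i.kind === "const") {
      known.set(i.dst, i.value);
      return i;
    }
    return { ...i, a: subst(i.a), b: subst(i.b) };
  });
};
```

Each pass makes only one step of progress, so the two are meant to be run repeatedly until nothing changes, which matches the fixed-point iteration mentioned later for versioned temporaries.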
106
+ ### High value (significantly smaller output)
107
+
108
+ | Pass | Description |
109
+ |---|---|
110
+ | **Liveness analysis** | compute live sets for all blocks (required by DCE + alloc) |
111
+ | **Temp coalescing** | merge non-interfering temporaries (reduces slots) |
112
+ | **Destination forwarding** | `t = op a, b; x = mov t` → `x = op a, b` when t dead |
113
+ | **Local CSE** | eliminate repeated identical subexpressions within a block |
114
+ | **Small function inlining** | inline trivial callee bodies at call site |
115
+
116
+ ---
117
+
118
+ ## Stage 5 — MIR → LIR *(Low-level IR, Minecraft-friendly)*
119
+
120
+ **Goal:** abstract operations → Minecraft scoreboard semantics.
121
+
122
+ This is where 3-address gets translated to 2-address **with awareness of
123
+ the destination-reuse pattern** that MC scoreboard requires.
124
+
125
+ ```
126
+ # MIR (3-address)
127
+ t1 = add a, b
128
+ t2 = add t1, c
129
+ x = mov t2
130
+
131
+ # LIR output (when t1, t2 not live after)
132
+ ScoreCopy x, a
133
+ ScoreAdd x, b
134
+ ScoreAdd x, c
135
+ ```
136
+
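One way to get the destination-reuse output above is a backward pass that forwards the final destination into single-use temporaries, followed by a forward pass that emits 2-address operations. The sketch below is an assumption about how that could work, not the algorithm in `src2/lir/lower.ts`. It deliberately ignores clobbering hazards (for example, the destination also appearing as a right-hand operand), which a real lowering would have to rule out with liveness information.

```typescript
// Hypothetical minimal MIR and LIR shapes.
type MirInstr =
  | { op: "add"; dst: string; a: string; b: string }
  | { op: "mov"; dst: string; src: string };
type LirInstr =
  | { kind: "ScoreCopy"; dst: string; src: string }
  | { kind: "ScoreAdd"; dst: string; src: string };

// Assumes each temp is defined once, used once, and dead afterwards.
function lowerToLir(mir: MirInstr[], isTemp: (name: string) => boolean): LirInstr[] {
  const alias = new Map<string, string>();
  const resolve = (n: string): string => alias.get(n) ?? n;

  // Backward pass: let single-use temps live directly in their consumer's slot.
  const kept: MirInstr[] = [];
  for (let i = mir.length - 1; i >= 0; i--) {
    const ins = mir[i];
    if (ins.op === "mov" && isTemp(ins.src)) {
      alias.set(ins.src, resolve(ins.dst)); // the mov becomes redundant
      continue;
    }
    if (ins.op === "add" && isTemp(ins.a)) alias.set(ins.a, resolve(ins.dst));
    kept.unshift(ins);
  }

  // Forward pass: emit copy + in-place add, skipping copies onto the same slot.
  const out: LirInstr[] = [];
  for (const ins of kept) {
    const dst = resolve(ins.dst);
    if (ins.op === "mov") {
      out.push({ kind: "ScoreCopy", dst, src: resolve(ins.src) });
      continue;
    }
    const a = resolve(ins.a);
    if (a !== dst) out.push({ kind: "ScoreCopy", dst, src: a });
    out.push({ kind: "ScoreAdd", dst, src: resolve(ins.b) });
  }
  return out;
}
```

Applied to the three MIR instructions in the example, with `t1` and `t2` treated as temporaries, this produces `ScoreCopy x, a`, `ScoreAdd x, b`, `ScoreAdd x, c`.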
137
+ Key decisions made here:
138
+
139
+ - Which values live in scoreboards vs NBT storage
140
+ - How to represent `execute`-chained subcommands
141
+ - Macro parameter injection points
142
+ - `$(param)` substitution for dynamic coordinates
143
+
144
+ This stage should be *target-specific* but not yet emitting strings.
145
+ LIR instructions are typed MC operations, not raw text.
146
+
147
+ ---
148
+
149
+ ## Stage 6 — LIR Optimization
150
+
151
+ Backend-specific optimizations that only make sense post-lowering:
152
+
153
+ | Pass | Description |
154
+ |---|---|
155
+ | **Scoreboard slot allocation** | minimize number of distinct objective slots used |
156
+ | **`execute` context extraction** | hoist repeated `execute as @p at @s` prefixes |
157
+ | **`execute` chain merging** | `execute A run execute B run cmd` → `execute A B run cmd` |
158
+ | **Guard block merging** | merge adjacent `execute if score ... matches` guards |
159
+ | **NBT/score carrier selection** | decide when to spill to storage vs keep in score |
160
+ | **Peephole** | local pattern rewrites (e.g. `op X = X` → nop) |
161
+ | **Command deduplication** | remove identical adjacent commands |
162
+ | **Function inlining / outlining** | inline trivial functions; outline repeated sequences |
163
+ | **Block layout** | order blocks to minimize `function` call overhead |
164
+
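The `execute` chain merging row lends itself to a small example. The sketch below assumes a hypothetical typed command node in which `execute` subcommands are kept as a list, so that merging is a list concatenation rather than a string rewrite.

```typescript
// Hypothetical typed command node: either raw text or an `execute` wrapper.
type Cmd =
  | { kind: "raw"; text: string }
  | { kind: "execute"; subcommands: string[]; run: Cmd };

// `execute A run execute B run cmd` becomes `execute A B run cmd`.
function mergeExecuteChains(cmd: Cmd): Cmd {
  if (cmd.kind !== "execute") return cmd;
  const inner = mergeExecuteChains(cmd.run);
  if (inner.kind === "execute") {
    return {
      kind: "execute",
      subcommands: [...cmd.subcommands, ...inner.subcommands],
      run: inner.run,
    };
  }
  return { ...cmd, run: inner };
}

// Render a command node back to text (what the emission stage would do).
function render(cmd: Cmd): string {
  return cmd.kind === "raw"
    ? cmd.text
    : `execute ${cmd.subcommands.join(" ")} run ${render(cmd.run)}`;
}
```

Because the recursion merges the innermost pair first, chains of any depth collapse into a single `execute`.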
165
+ ---
166
+
167
+ ## Stage 7 — Emission
168
+
169
+ **Goal:** LIR → `.mcfunction` files on disk.
170
+
171
+ - Assign each LIR block to a `namespace:path/block_name` function
172
+ - Emit `function` calls for control flow edges
173
+ - Generate `load.mcfunction`: objective creation, storage init, const table
174
+ - Generate `tick.mcfunction`: `@tick`-annotated function dispatch
175
+ - Emit call graph in dependency order
176
+ - Write sourcemap (IR name → file:line for diagnostics)
177
+
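To make the emission step concrete, here is a minimal sketch that turns a few typed LIR instructions into command text. The instruction set and the `EmitContext` record are assumptions made for the example; only the `scoreboard players` and `function` command forms are standard Minecraft syntax.

```typescript
// Hypothetical subset of typed LIR instructions.
type LirInstr =
  | { kind: "ScoreSet"; dst: string; value: number }
  | { kind: "ScoreCopy"; dst: string; src: string }
  | { kind: "ScoreAdd"; dst: string; src: string }
  | { kind: "Call"; fn: string };

interface EmitContext {
  namespace: string;
  objective: string;
}

// Strings are produced only here, at the very end of the pipeline.
function emitInstr(i: LirInstr, ctx: EmitContext): string {
  const slot = (name: string): string => `$${name} ${ctx.objective}`;
  switch (i.kind) {
    case "ScoreSet":
      return `scoreboard players set ${slot(i.dst)} ${i.value}`;
    case "ScoreCopy":
      return `scoreboard players operation ${slot(i.dst)} = ${slot(i.src)}`;
    case "ScoreAdd":
      return `scoreboard players operation ${slot(i.dst)} += ${slot(i.src)}`;
    case "Call":
      return `function ${ctx.namespace}:${i.fn}`;
  }
}

// One LIR block becomes one function: an id plus its file contents.
function emitFunction(path: string, body: LirInstr[], ctx: EmitContext): { id: string; text: string } {
  return {
    id: `${ctx.namespace}:${path}`,
    text: body.map((i) => emitInstr(i, ctx)).join("\n") + "\n",
  };
}
```

Writing the files to disk and generating `load.mcfunction` and `tick.mcfunction` are left out of this sketch.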
178
+ ---
179
+
180
+ ## Summary
181
+
182
+ ```
183
+ Source
184
+
185
+ ▼ Stage 1
186
+ AST (typed, name-resolved)
187
+
188
+ ▼ Stage 2
189
+ HIR (desugared, structured)
190
+
191
+ ▼ Stage 3
192
+ MIR (3-address CFG)
193
+
194
+ ▼ Stage 4
195
+ MIR' (optimized)
196
+
197
+ ▼ Stage 5
198
+ LIR (MC-friendly 2-address)
199
+
200
+ ▼ Stage 6
201
+ LIR' (backend-optimized)
202
+
203
+ ▼ Stage 7
204
+ .mcfunction files
205
+ ```
206
+
207
+ The key insight: **optimization-friendly representation (Stage 4) and
208
+ target-friendly representation (Stage 6) are separate**. Trying to do
209
+ both at once is why the current IR is hard to extend.
210
+
211
+ ---
212
+
213
+ ## Migration Notes
214
+
215
+ - Current `src/lowering/index.ts` = Stage 2 + 3 + 5 merged → split into three
216
+ - Current `src/optimizer/` = partial Stage 4, operating on 2-address → rewrite around 3-address MIR
217
+ - Current `src/codegen/` = Stage 6 + 7 merged → split at the LIR boundary
218
+ - Current `src/ir/` = needs 3-address extension or replacement
219
+ - Tests: keep end-to-end `.mcrs → .mcfunction` tests as regression suite; add unit tests per stage
220
+
221
+ ---
222
+
223
+ *This document was drafted to guide the next major refactor. Details may change during implementation.*
224
+
225
+ ---
226
+
227
+ ## Tech Stack & Infrastructure Decisions
228
+
229
+ ### Language: stay in TypeScript
230
+
231
+ The MC target is too domain-specific for a general backend (LLVM, Cranelift, QBE)
232
+ to add value. The compilation workload is also small enough (functions are
233
+ typically < 100 MIR instructions) that performance is not a concern.
234
+ TS gives good type safety for IR node types and is already the existing codebase.
235
+
236
+ ### SSA: no, use versioned temporaries
237
+
238
+ | | SSA | Versioned temps |
239
+ |---|---|---|
240
+ | Constant prop | trivial one-pass | fixed-point iteration needed |
241
+ | DCE | trivial | single backward sweep |
242
+ | Copy prop | trivial | one extra level of indirection |
243
+ | Construction | dominator tree + φ-insertion | trivial, just increment a counter |
244
+ | Deconstruction | must run before LIR | N/A |
245
+
246
+ For function bodies in the 20–200 instruction range with no complex loop
247
+ induction variable analysis, versioned temps are sufficient. SSA complexity
248
+ is not justified at this scale.
249
+
250
+ ### Pass framework: nanopass style
251
+
252
+ Each optimization pass should be a pure function:
253
+
254
+ ```typescript
255
+ type Pass = (module: MIRModule) => MIRModule
256
+ ```
257
+
258
+ - No mutation of shared global state
259
+ - Can be verified with `verifyMIR(module)` between passes
260
+ - Pipeline is just an array of passes: `const pipeline: Pass[] = [constantFold, copyProp, dce, ...]`
261
+ - Easy to toggle a pass for debugging
262
+ - Easy to add `before/after` IR dumps per pass
263
+
264
+ Current code mixes optimization and lowering in the same methods.
265
+ The nanopass shape forces separation.
266
+
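A pipeline runner in this style might look like the sketch below. The `MIRModule` shape here is a placeholder, and `PassEntry`, the `verify` callback, and the `dump` hook are hypothetical names for the toggling, verification, and before/after dump points listed above.

```typescript
// Placeholder module shape; a real MIRModule would carry functions and blocks.
interface MIRModule {
  readonly instrs: readonly string[];
}
type Pass = (module: MIRModule) => MIRModule;

interface PassEntry {
  name: string;
  run: Pass;
  enabled?: boolean; // set to false to skip a pass while debugging
}

function runPipeline(
  input: MIRModule,
  passes: PassEntry[],
  verify: (m: MIRModule) => string[], // returns a list of problems, empty if valid
  dump: (label: string, m: MIRModule) => void = () => {},
): MIRModule {
  let current = input;
  for (const pass of passes) {
    if (pass.enabled === false) continue;
    dump(`before ${pass.name}`, current);
    current = pass.run(current);
    dump(`after ${pass.name}`, current);
    const problems = verify(current);
    if (problems.length > 0) {
      throw new Error(`${pass.name} produced invalid MIR: ${problems.join("; ")}`);
    }
  }
  return current;
}
```

Verifying after every pass pins a broken invariant on the pass that introduced it, which is the main debugging benefit of keeping passes pure.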
267
+ ### What to reuse from the current codebase
268
+
269
+ | Module | Verdict | Notes |
270
+ |---|---|---|
271
+ | `src/parser/` | **keep as-is** | solid, already produces typed AST |
272
+ | `src/lexer/` | **keep as-is** | — |
273
+ | `src/runtime/` | **keep as-is** | MC runtime simulator used in tests |
274
+ | `src/__tests__/` | **keep e2e tests** | regression suite covering `.mcrs → .mcfunction` |
275
+ | `src/optimizer/passes.ts` | **port logic, rewrite impl** | copy the *idea*, not the regex machinery |
276
+ | `src/optimizer/commands.ts` | **discard** | regex-based command matching, replace with typed LIR |
277
+ | `src/ir/` | **replace** | extend with 3-address form and explicit CFG |
278
+ | `src/lowering/` | **split into Stage 2+3+5** | 3,500-line file doing too many things |
279
+ | `src/codegen/` | **split into Stage 6+7** | keep emission logic, rebuild on new LIR |
280
+
281
+ ---
282
+
283
+ ## Current Architecture (as of v1.2.x)
284
+
285
+ Knowing what we have makes migration planning concrete.
286
+
287
+ ```
288
+ src/
289
+ lexer/index.ts Tokenizer
290
+ parser/index.ts Recursive-descent parser → AST
291
+ lowering/index.ts AST → IR (3,500 lines; Stages 2+3+5 merged)
292
+ ir/types.ts IR types: IRModule, IRFunction, IRBlock, IRInstr
293
+ optimizer/
294
+ passes.ts Optimization passes on 2-addr IR
295
+ commands.ts Regex-based command analysis (OBJ pattern etc.)
296
+ structure.ts Structural analysis helpers
297
+ codegen/
298
+ mcfunction/index.ts IR → .mcfunction text files
299
+ compile.ts Top-level compile() entry point
300
+ cli.ts CLI wrapper
301
+ runtime/index.ts MCRuntime: scoreboard + storage simulator for tests
302
+ stdlib/
303
+ math.mcrs sin/cos/sqrt tables + trig, 91-entry lookup
304
+ vec.mcrs 2D/3D vector ops using fixed-point
305
+ advanced.mcrs smoothstep, smootherstep, clamp, etc.
306
+ bigint.mcrs 8-limb base-10000 BigInt
307
+ timer.mcrs single-instance Timer (tick countdown)
308
+ ```
309
+
310
+ The IR is 2-address:
311
+
312
+ ```
313
+ x = a (copy)
314
+ x += b (in-place add)
315
+ x *= c (in-place mul)
316
+ ```
317
+
318
+ Arithmetic sequences are modeled as chains of in-place updates on a single
319
+ destination, which obscures the value-dependency graph and complicates CSE.
320
+
321
+ Optimization passes operate on `IRInstr` objects that contain raw MC command
322
+ strings. Several passes (copy propagation, CSE, block merge) parse those strings
323
+ with regular expressions to extract slot names and objective names, which is
324
+ fragile and tightly coupled to the objective naming scheme.
325
+
326
+ ---
327
+
328
+ ## Lessons Learned / Design Pitfalls
329
+
330
+ These are real bugs or design limitations that shaped the current codebase.
331
+ The redesign should address all of them.
332
+
333
+ ### 1. Global mutable objective name state
334
+
335
+ **Problem:** The scoreboard objective name (`rs`, then `__namespace`) is stored
336
+ in a module-level mutable variable. To support multiple datapacks in one process,
337
+ we had to add `setScoreboardObjective()`, `setOptimizerObjective()`,
338
+ `setStructureObjective()` — three separate setters across three files.
339
+ The optimizer's regex patterns also had to be regenerated dynamically.
340
+
341
+ **Root cause:** objective name was a constant baked into every pass.
342
+
343
+ **Fix in redesign:** Pass a `CompileContext` record through the entire pipeline.
344
+ No global state.
345
+
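A minimal sketch of what such a context record could look like. The field names and the `createContext` helper are assumptions; the `__<namespace>` default follows the objective naming described in this document.

```typescript
// Hypothetical immutable context threaded through every stage.
interface CompileContext {
  readonly namespace: string;
  readonly objective: string;
}

function createContext(namespace: string, objective: string = `__${namespace}`): CompileContext {
  return Object.freeze({ namespace, objective });
}

// A stage reads the objective from the context instead of a module-level variable.
function emitScoreCopy(ctx: CompileContext, dst: string, src: string): string {
  return `scoreboard players operation $${dst} ${ctx.objective} = $${src} ${ctx.objective}`;
}
```

With this shape, two datapacks can be compiled in one process without any setter calls, because each compilation carries its own record.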
346
+ ---
347
+
348
+ ### 2. Optimizer regex matching on command strings
349
+
350
+ **Problem:** Copy propagation, CSE, and block merge all pattern-match on raw
351
+ MC command text like `"scoreboard players operation $x rs = $y rs"`.
352
+ When the objective name changed from `rs` to `__namespace`, every regex had
353
+ to be updated. When the regex didn't account for a case (e.g. `$x rs 0`),
354
+ the pass silently failed.
355
+
356
+ **Root cause:** IR instructions are strings, not typed nodes.
357
+
358
+ **Fix in redesign:** LIR instructions are typed (e.g. `ScoreCopy`, `ScoreAdd`).
359
+ Passes pattern-match on structured nodes, not strings.
360
+
361
+ ---
362
+
363
+ ### 3. Lowering, desugaring, and macro detection all in one pass
364
+
365
+ **Problem:** `src/lowering/index.ts` is 3,500 lines that simultaneously:
366
+ - Desugar `for`/`+=`/ternary
367
+ - Build basic blocks and terminators
368
+ - Handle builtin dispatch (particle, setblock, tp, ...)
369
+ - Detect macro parameters and rewrite coordinates as `$(param)`
370
+ - Manage function specialization for stdlib callbacks
371
+ - Track struct fields and impl method dispatch
372
+
373
+ Adding any new feature requires understanding all of this at once.
374
+
375
+ **Fix in redesign:** Each of these is a separate stage.
376
+
377
+ ---
378
+
379
+ ### 4. `\x01` sentinel for macro line prefix
380
+
381
+ **Problem:** When a builtin command needed a `$` prefix for MC macro syntax
382
+ (e.g. `$particle ... ^$(px) ...`), the lowering used a literal `$` prefix.
383
+ The codegen's `resolveRaw()` then saw `$particle` and allocated a fresh
384
+ temporary named `particle`, replacing the whole `$particle` token with `$v` (or whatever
385
+ name the temp was allocated). The particle command silently became `$v minecraft:end_rod ...`,
386
+ which MC ignored.
387
+
388
+ **Root cause:** The `$var` variable reference syntax and the MC macro `$` line
389
+ prefix shared the same sigil in raw command strings.
390
+
391
+ **Fix applied:** Use `\x01` as sentinel in IR; codegen converts `\x01` → `$`
392
+ after variable resolution.
393
+
394
+ **Fix in redesign:** Typed LIR instruction `MacroParticle { ... }` — no raw
395
+ string parsing needed.
396
+
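A sketch of how a typed instruction avoids the sigil clash: the macro line prefix is added at emission time, based on whether any coordinate is a macro parameter, so no raw string ever has to be re-scanned for `$`. The field layout of `MacroParticle` below is an assumption, since the document only names the instruction.

```typescript
// Hypothetical coordinate and instruction shapes.
type Coord =
  | { kind: "literal"; text: string }
  | { kind: "param"; base: "~" | "^"; name: string };

interface MacroParticle {
  kind: "MacroParticle";
  particle: string;
  pos: [Coord, Coord, Coord];
}

function emitParticle(p: MacroParticle): string {
  const coords = p.pos.map((c) => (c.kind === "literal" ? c.text : `${c.base}$(${c.name})`));
  const usesMacro = p.pos.some((c) => c.kind === "param");
  const line = `particle ${p.particle} ${coords.join(" ")}`;
  // Only macro lines get the leading `$`; it is never confused with a variable.
  return usesMacro ? `$${line}` : line;
}
```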
397
+ ---
398
+
399
+ ### 5. Cross-function variable name collision
400
+
401
+ **Problem:** Two functions `foo` and `bar` could each declare a variable `x`,
402
+ both getting lowered to scoreboard slot `$x`. If both were inlined or called
403
+ in the same tick context, they shared the slot.
404
+
405
+ **Fix applied:** IR variable names are scoped as `$fnname_varname`.
406
+
407
+ **Fix in redesign:** 3-address MIR uses globally-unique temporaries (counter-based).
408
+ Slot allocation is a separate explicit pass.
409
+
410
+ ---
411
+
412
+ ### 6. `mc_name` early-return bypassed `#rs` resolution
413
+
414
+ **Problem:** In `exprToScoreboardObjective`, the handler for `mc_name` returned
415
+ `expr.value` directly, bypassing the `#rs → LOWERING_OBJ` special case.
416
+ All timer stdlib tests that used `#rs` as the objective were matching the
417
+ literal string `"rs"` instead of the namespace-specific objective.
418
+
419
+ **Root cause:** Early-return before the special-case check.
420
+
421
+ **Fix applied:** Check `value === 'rs'` before the early return.
422
+
423
+ **Fix in redesign:** Objective references should be a first-class IR type,
424
+ not a string that might be the literal `"rs"` or the special token `"rs"`.
425
+
426
+ ---
427
+
428
+ ### 7. Timer is single-instance
429
+
430
+ **Problem:** `timer.mcrs` stores tick count and active state on fake players
431
+ `timer_ticks` and `timer_active`. All Timer instances share the same player,
432
+ so only one Timer can be active at a time.
433
+
434
+ **Root cause:** No per-instance storage mechanism. The `_id` field was stubbed
435
+ but never implemented.
436
+
437
+ **Path forward:** Per-instance state needs either:
438
+ - Named fake player per instance: `timer_1_ticks`, `timer_2_ticks`, ... (requires macro, i.e. `$`-prefixed, `scoreboard` commands)
439
+ - NBT array slot per instance (same pattern as BigInt limbs)
440
+
441
+ ---
442
+
443
+ ### 8. `^varname` not supported in lexer until v1.2.x
444
+
445
+ **Problem:** `^px` (local coordinate with variable offset) was lexed as two
446
+ tokens: `^` (local_coord) + `px` (ident). The parser then failed with
447
+ "Expected ')' but got 'ident'".
448
+
449
+ Only `~varname` (relative coordinate) supported variable names.
450
+ This made the macro-based dynamic particle positioning impossible to write.
451
+
452
+ **Fix applied:** Lexer now reads `^identifier` as a single `local_coord` token.
453
+
454
+ **Fix in redesign:** `^varname` and `~varname` should be unified as
455
+ `LocalCoord(varname | number)` and `RelCoord(varname | number)` in the AST.
456
+
457
+ ---
458
+
459
+ ### 9. sin_fixed is a lookup table, and that is correct
460
+
461
+ Not a pitfall — a deliberate constraint.
462
+
463
+ MC scoreboards support only 32-bit integer arithmetic (add, sub, mul, div, mod).
464
+ There is no trigonometric instruction. Taylor series (`sin x = x - x³/6 + ...`)
465
+ overflows INT32 by the third term in fixed-point ×1000 representation.
466
+ CORDIC requires ~20 integer iterations per call.
467
+
468
+ A 91-entry table (0°–90° with quadrant symmetry) gives exact 1° resolution
469
+ in O(1) and is the standard approach on integer-only platforms (GBA BIOS,
470
+ early DSP chips, NDS).
471
+
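To make the table approach concrete, here is a minimal TypeScript sketch of a 91-entry quarter-wave table with quadrant folding. The names `SIN_TABLE`, `sinFixed`, and `cosFixed` and the round-to-nearest rule are assumptions for illustration; the real stdlib builds its table at `@load` and may round differently.

```typescript
// Hypothetical sketch, not the stdlib's code: a quarter-wave sine table in
// fixed-point x1000, folded into all four quadrants by symmetry.
const SCALE = 1000;

// 91 entries for 0..90 degrees, each sin(deg) * 1000 rounded to the nearest integer.
const SIN_TABLE: number[] = Array.from({ length: 91 }, (_, deg) =>
  Math.round(Math.sin((deg * Math.PI) / 180) * SCALE),
);

function lookup(index: number): number {
  const value = SIN_TABLE[index];
  if (value === undefined) throw new RangeError(`index ${index} is outside 0..90`);
  return value;
}

// Fold any integer angle into the first quadrant, then apply the sign.
function sinFixed(deg: number): number {
  const a = ((deg % 360) + 360) % 360; // normalize to 0..359
  if (a <= 90) return lookup(a);
  if (a <= 180) return lookup(180 - a);
  if (a <= 270) return -lookup(a - 180);
  return -lookup(360 - a);
}

// cos(x) = sin(x + 90 degrees), so the same table serves both functions.
function cosFixed(deg: number): number {
  return sinFixed(deg + 90);
}
```

Each call is one normalization, at most three comparisons, and one table read, which is the O(1) cost the text refers to.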
472
+ **Implication for redesign:** The `sin_fixed` table pattern (initialized at
473
+ `@load`, read via storage array indexing + macros) is a first-class language
474
+ pattern, not a hack. The stdlib should keep it.
475
+
476
+ ---
477
+
478
+ ### 10. Datapack objective collision (the `rs` problem)
479
+
480
+ **Problem:** All compiled datapacks shared a single scoreboard objective named
481
+ `rs`. Two datapacks in the same world had their mangle tables collide — the
482
+ `$a rs` slot meant different things in each datapack's load function.
483
+
484
+ **Fix applied:** Default objective is now `__<namespace>` (double-underscore
485
+ prefix, following the `__load`/`__tick` convention).
486
+
487
+ **Fix in redesign:** `CompileContext` carries the objective name. No global.
488
+
489
+ ---
490
+
491
+ ## Language Design: TypeScript Syntax, Custom Frontend
492
+
493
+ ### Should we reuse the TypeScript frontend (tsc / ts-morph)?
494
+
495
+ **No.** The core RedScript syntax is not valid TypeScript:
496
+
497
+ ```redscript
498
+ foreach (p in @a[tag=foo, limit=1]) at @s {
499
+ particle("end_rod", ^0, ^0, ^5, 0.02, 0.02, 0.02, 0, 10);
500
+ }
501
+ kill(@e[tag=screen]);
502
+ ```
503
+
504
+ `@a[tag=foo]` is not a valid TS expression (confused with array-access on a
505
+ decorator). `^5` / `~-3` are not valid TS expressions. `at @s {}` does not
506
+ exist. Encoding these as valid TS trades the language for a verbose API:
507
+
508
+ ```typescript
509
+ // not the goal
510
+ forEach(selector('@a', { tag: 'foo', limit: 1 }), (p) =>
511
+ atSelf(p, () => particle('end_rod', localCoord(0, 0, 5), ...)));
512
+ ```
513
+
514
+ If we have to write that, RedScript provides no value. Keep the custom parser.
515
+
516
+ ### Should we reuse TypeScript's type checker (tsc)?
517
+
518
+ **No.** RedScript only needs a small subset of TypeScript's type system:
519
+
520
+ | Feature | TypeScript | RedScript needs |
521
+ |---|---|---|
522
+ | Primitive types | `number`, `string`, `boolean`, `symbol`, `bigint`... | `int`, `bool`, `string`, `float` |
523
+ | Compound | union, intersection, conditional, mapped, template literal | `struct`, `enum`, `T[]` |
524
+ | MC-specific | — | `selector<T>`, `BlockPos`, `void` |
525
+ | Generics | constrained (`extends`), `infer`, conditional | simple `T<U>` instantiation |
526
+ | Complexity | Turing-complete type system | intentionally simple |
527
+
528
+ Embedding tsc's type checker means inheriting `never`, `unknown`, conditional
529
+ types, `infer`, mapped types — none of which are useful on the MC target. A
530
+ lightweight type checker custom-built for the above set is smaller,
531
+ faster, and easier to extend with MC-specific rules.
532
+
533
+ ### What to borrow from TypeScript (syntax conventions only)
534
+
535
+ Keep the source syntax **familiar to TypeScript developers** without binding to tsc:
536
+
537
+ ```redscript
538
+ // These match TypeScript conventions — keep them
539
+ let x: int = 0;
540
+ const MAX: int = 100;
541
+ fn add(a: int, b: int): int { return a + b; }
542
+ struct Vec2 { x: int; y: int; }
543
+ impl Vec2 {
544
+ fn length(self): int { ... }
545
+ }
546
+ type Callback = (x: int) => void; // function type syntax
547
+ ```
548
+
549
+ ```redscript
550
+ // MC-specific extensions — keep them as-is, do not force into TS grammar
551
+ @tick fn _update() { ... } // decorator-style annotation
552
+ foreach (p in @a[tag=foo]) at @s { } // MC selector iteration
553
+ let s: selector<entity> = @e[...]; // generic selector type
554
+ kill(@e[tag=screen]); // MC command as builtin call
555
+ particle("end_rod", ^px, ^py, ^5, ...) // caret/tilde coordinates
556
+ ```
557
+
558
+ The rule: **syntax form follows TypeScript; semantics follow Minecraft.**
559
+
560
+ ### IDE support: implement LSP, not a tsc plugin
561
+
562
+ For real IntelliSense (completions, hover types, go-to-definition), the correct
563
+ path is a Language Server Protocol implementation:
564
+
565
+ ```
566
+ redscript-lsp
567
+ ├── parse .mcrs → typed AST
568
+ ├── type inference + error diagnostics
569
+ ├── completions: builtin names, selector attributes, struct fields
570
+ ├── hover: type info, MC command documentation
571
+ └── go-to-definition: cross-file symbol resolution
572
+ ```
573
+
574
+ LSP decouples the language server from the editor: VS Code, Neovim, Helix,
575
+ Zed, and any LSP-capable editor get support from one implementation.
576
+ A tsc plugin would be harder, VS Code-only, and still require all the same
577
+ semantic analysis.
578
+
579
+
580
+ ---
581
+
582
+ ## MC Compilation Target: Computational Commands
583
+
584
+ This section covers the MC commands that actually participate in computation —
585
+ not side-effect commands like `particle`, `summon`, `say`, `playsound`, etc.
586
+ Every operation in the IR must ultimately map to one or more of these.
587
+
588
+ ### Scoreboard: the "CPU registers" of MC
589
+
590
+ Scoreboard objectives hold named fake-player slots, each storing one INT32.
591
+ This is the primary computational medium.
592
+
593
+ ```
594
+ # Initialization
595
+ scoreboard objectives add <obj> dummy
596
+
597
+ # Write constant
598
+ scoreboard players set <fake_player> <obj> <value>
599
+
600
+ # Copy
601
+ scoreboard players operation $dst <obj> = $src <obj>
602
+
603
+ # Arithmetic (all in-place, 2-address)
604
+ scoreboard players operation $dst <obj> += $src <obj> # add
605
+ scoreboard players operation $dst <obj> -= $src <obj> # sub
606
+ scoreboard players operation $dst <obj> *= $src <obj> # mul
607
+ scoreboard players operation $dst <obj> /= $src <obj> # integer div (floors toward negative infinity)
608
+ scoreboard players operation $dst <obj> %= $src <obj> # mod (floor modulo, sign follows divisor)
609
+
610
+ # Min / max
611
+ scoreboard players operation $dst <obj> < $src <obj> # dst = min(dst, src)
612
+ scoreboard players operation $dst <obj> > $src <obj> # dst = max(dst, src)
613
+
614
+ # Swap
615
+ scoreboard players operation $a <obj> >< $b <obj>
616
+ ```
617
+
618
+ **Constraints:**
619
+ - INT32 only. No float, no 64-bit.
620
+ - Division floors toward negative infinity (Java `Math.floorDiv`, not the `/` operator); `%=` behaves like `Math.floorMod`.
621
+ - No bitwise operations. XOR/AND/OR must be emulated with arithmetic.
622
+ - No comparison that produces a value — comparisons only appear in `execute if score`.
623
+
624
+ ### `execute store result score` — bridge from commands to scores
625
+
626
+ Captures the integer result of a command into a score slot:
627
+
628
+ ```
629
+ execute store result score $dst <obj> run <command>
630
+ ```
631
+
632
+ Used for:
633
+ - Reading entity NBT: `run data get entity @s Health 1`
634
+ - Reading storage: `run data get storage <ns> <path> <scale>`
635
+ - Capturing command success: `execute store success score $dst <obj> run ...`
636
+ - Returning from a function: `run function <ns>:<fn>` (captures `return` value)
637
+
638
+ ### `execute if/unless score` — the only conditional
639
+
640
+ All control flow in the compiled output is expressed with score comparisons:
641
+
642
+ ```
643
+ # Range check (most common — if score matches an integer range)
644
+ execute if score $x <obj> matches <N> run function <ns>:then_block
645
+ execute if score $x <obj> matches <N>..<M> run function <ns>:range_block
646
+ execute if score $x <obj> matches ..<N> run function <ns>:le_block
647
+
648
+ # Two-operand comparison
649
+ execute if score $a <obj> = $b <obj> run ... # a == b
650
+ execute if score $a <obj> < $b <obj> run ... # a < b
651
+ execute unless score $a <obj> = $b <obj> run ... # a != b
652
+ ```
653
+
654
+ **Why `matches` vs two-operand:**
655
+ - `matches N` is cheaper than `= $const` (no extra fake-player needed to hold the constant).
656
+ - `matches 1..` is the canonical boolean-true check.
657
+ - Two-operand form needed for dynamic comparisons (`a < b` where both vary).
658
+
659
+ ### NBT Storage: heap memory
660
+
661
+ NBT storage (`data storage <ns>:<path>`) is the only persistent structured
662
+ memory available. It holds typed NBT values (int, double, string, list, compound).
663
+
664
+ ```
665
+ # Write literal
666
+ data modify storage <ns> <path> set value <nbt_literal>
667
+
668
+ # Copy between paths
669
+ data modify storage <ns> <dst_path> set from storage <ns> <src_path>
670
+
671
+ # Read into score (with optional scale factor)
672
+ execute store result score $dst <obj> run data get storage <ns> <path> 1
673
+
674
+ # Write score into storage (with scale: useful for float conversion)
675
+ execute store result storage <ns> <path> int 1 run scoreboard players get $src <obj>
676
+ execute store result storage <ns> <path> double 0.01 run scoreboard players get $src <obj>
677
+ # → stores (score × 0.01) as a double; e.g. score=975 → NBT 9.75d
678
+ ```
679
+
680
+ **Scale factor:**
681
+ The `<scale>` in `data get` / `execute store result storage` is a multiplier
682
+ applied on read/write. This is the only way to convert between integer scores
683
+ and fractional NBT values (used for float-coordinate macro parameters).
684
+
685
+ ### Array indexing via NBT
686
+
687
+ Static index (compile-time constant):
688
+
689
+ ```
690
+ execute store result score $dst <obj> run data get storage rs:heap array[5] 1
691
+ data modify storage rs:heap array[3] set value 42
692
+ ```
693
+
694
+ Dynamic index (runtime variable) — requires MC 1.20.2+ macros:
695
+
696
+ ```
697
+ # Step 1: write index into macro args storage
698
+ execute store result storage rs:macro_args i int 1 run scoreboard players get $idx <obj>
699
+
700
+ # Step 2: call macro function
701
+ function ns:_read_array with storage rs:macro_args
702
+
703
+ # Step 3: inside _read_array.mcfunction (macro function)
704
+ $execute store result score $ret <obj> run data get storage rs:heap array[$(i)] 1
705
+ ```
706
+
707
+ **RedScript `storage_get_int` / `storage_set_int` builtins compile to exactly this pattern.**
708
+
709
+ ### MC 1.20.2+ Function Macros
710
+
711
+ A function file that contains `$(key)` substitutions must be called with
712
+ `function <ns>:<fn> with storage <ns>:<macro_storage>`.
713
+ Any line containing `$(...)` must begin with `$`.
714
+
715
+ ```
716
+ # Caller: populate rs:macro_args, then call
717
+ execute store result storage rs:macro_args px double 0.01 run scoreboard players get $px_int <obj>
718
+ execute store result storage rs:macro_args py double 0.01 run scoreboard players get $py_int <obj>
719
+ function rsdemo:_draw with storage rs:macro_args
720
+
721
+ # Inside _draw.mcfunction (macro function):
722
+ $particle minecraft:end_rod ^$(px) ^$(py) ^5 0.02 0.02 0.02 0 10
723
+ ```
724
+
725
+ **What macros unlock:**
726
+ - Dynamic array indexing (above)
727
+ - Dynamic coordinates in `particle`, `setblock`, `tp`, `fill`, etc.
728
+ - Dynamic entity selectors and NBT paths
729
+
730
+ **Constraint:** Macro substitution is string interpolation at the command level.
731
+ The substituted value must be a valid literal for that position (integer, float,
732
+ coordinate, selector string). No arithmetic is performed during substitution.
733
+
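The string-interpolation constraint can be illustrated with a small TypeScript model of macro expansion. This is a sketch under assumptions: `expandMacroLine` is a hypothetical helper, not Minecraft's or RedScript's actual code, and it only models the `$(key)` substitution described above.

```typescript
// Hypothetical model of macro-line expansion: pure text substitution.
function expandMacroLine(line: string, args: Record<string, string>): string {
  // Lines without the leading '$' are ordinary commands and pass through unchanged.
  if (!line.startsWith("$")) return line;
  return line.slice(1).replace(/\$\(([A-Za-z0-9_]+)\)/g, (_match, key: string) => {
    const value = args[key];
    if (value === undefined) throw new Error(`missing macro argument: ${key}`);
    return value; // pasted verbatim: no arithmetic, no type conversion
  });
}

// Example: the _draw macro line from above with px = 1.5 and py = 2.25.
const expanded = expandMacroLine(
  "$particle minecraft:end_rod ^$(px) ^$(py) ^5 0.02 0.02 0.02 0 10",
  { px: "1.5", py: "2.25" },
);
```

Because the value is pasted as text, any scaling (such as the `double 0.01` store) has to happen before the call, when the argument is written to storage.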
734
+ ### `function` and `return`: call graph and early exit
735
+
736
+ ```
737
+ # Unconditional call
738
+ function <ns>:<path>
739
+
740
+ # Conditional call (the compiled form of if/else branches)
741
+ execute if score $cond <obj> matches 1.. run function <ns>:then_0
742
+ execute if score $cond <obj> matches ..0 run function <ns>:else_0
743
+
744
+ # Macro function call
745
+ function <ns>:<path> with storage <ns>:<macro_storage>
746
+
747
+ # Return a value (MC 1.20.3+)
748
+ return 42
749
+ return run scoreboard players get $x <obj>
750
+
751
+ # Early return (MC 1.20.2+, exits current function immediately)
752
+ return 0
753
+ ```
754
+
755
+ `return run <cmd>` stores the command's result as the function's return value,
756
+ readable via `execute store result score $ret <obj> run function ...`.
757
+
758
+ ### Summary: IR operation → MC command mapping
759
+
760
+ | IR operation | MC command |
761
+ |---|---|
762
+ | `x = const N` | `scoreboard players set $x <obj> N` |
763
+ | `x = copy y` | `scoreboard players operation $x <obj> = $y <obj>` |
764
+ | `x = add y, z` | copy y→x, then `+= $z` |
765
+ | `x = sub y, z` | copy y→x, then `-= $z` |
766
+ | `x = mul y, z` | copy y→x, then `*= $z` |
767
+ | `x = div y, z` | copy y→x, then `/= $z` |
768
+ | `x = mod y, z` | copy y→x, then `%= $z` |
769
+ | `if x == 0` | `execute if score $x <obj> matches 0 run ...` |
770
+ | `if x > y` | `execute if score $x <obj> > $y <obj> run ...` |
771
+ | `x = array[i]` | macro: store i, call macro fn, read `$ret` |
772
+ | `array[i] = v` | macro: store i+v, call macro fn |
773
+ | `call fn(args...)` | set up params in `$p0..$pN`, `function <ns>:<fn>` |
774
+ | `call_macro fn(args...)` | store args in `rs:macro_args`, `function ... with storage` |
775
+ | `return x` | `scoreboard players operation $ret <obj> = $x <obj>` |
776
+
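As an illustration of the "copy, then in-place op" rows in the table, here is a minimal TypeScript sketch that lowers one 3-address instruction to 2-address scoreboard commands. The function name `lowerBinary` and the scratch slot `$__tmp` are assumptions for this example, not the compiler's actual API.

```typescript
// Hypothetical sketch of 3-address to 2-address lowering.
type BinOp = "add" | "sub" | "mul" | "div" | "mod";

const OP_SYMBOL: Record<BinOp, string> = {
  add: "+=", sub: "-=", mul: "*=", div: "/=", mod: "%=",
};

function copyCmd(dst: string, src: string, obj: string): string {
  return `scoreboard players operation $${dst} ${obj} = $${src} ${obj}`;
}

// Lowers `dst = op lhs, rhs` into scoreboard commands on objective `obj`.
function lowerBinary(op: BinOp, dst: string, lhs: string, rhs: string, obj: string): string[] {
  const cmds: string[] = [];
  let src = rhs;
  if (dst === rhs && dst !== lhs) {
    // Copying lhs into dst would clobber rhs, so save rhs in a scratch slot first.
    src = "__tmp";
    cmds.push(copyCmd(src, rhs, obj));
  }
  if (dst !== lhs) cmds.push(copyCmd(dst, lhs, obj)); // skip the copy when dst already holds lhs
  cmds.push(`scoreboard players operation $${dst} ${obj} ${OP_SYMBOL[op]} $${src} ${obj}`);
  return cmds;
}
```

The aliasing case (`x = sub y, x`) is the one place where the simple two-command pattern is not enough, which is why the sketch spends a scratch slot on it.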
777
+
778
+ ---
779
+
780
+ ## Current Debugging & Tooling
781
+
782
+ ### CLI commands
783
+
784
+ ```
785
+ redscript compile <file> [-o <out>] [--namespace <ns>] [--scoreboard <obj>] [--no-dce] [--no-mangle]
786
+ redscript watch <dir> [-o <out>] [--namespace <ns>] [--hot-reload <url>]
787
+ redscript check <file>
788
+ redscript fmt <file> [file2 ...]
789
+ redscript repl
790
+ redscript generate-dts [-o <file>]
791
+ ```
792
+
793
+ ---
794
+
795
+ ### `redscript check` — local syntax checker
796
+
797
+ **What it does:** Parse + preprocess only. Exits 0 if the file is syntactically
798
+ valid, non-zero with a formatted error otherwise. Does **not** run the type
799
+ checker, lowering, or optimizer.
800
+
801
+ ```bash
802
+ redscript check examples/readme-demo.mcrs
803
+ # ✓ examples/readme-demo.mcrs is valid
804
+ ```
805
+
806
+ **Known bug:** `check` always passes `namespace = 'redscript'` hardcoded to the
807
+ parser, regardless of the filename or any `--namespace` flag. This means
808
+ namespace-sensitive parse errors (e.g. a symbol that happens to conflict with
809
+ the literal string `"redscript"`) may behave differently under `check` vs
810
+ `compile`. Fix: derive namespace from filename (same logic as `compile`) or
811
+ accept a `--namespace` flag.
812
+
813
+ Additionally, `check` only calls the parser — it does **not** call the type
814
+ checker (`TypeChecker`). Type errors silently pass `check` and only surface
815
+ during `compile`. The type checker itself is currently in "warn mode" (collects
816
+ errors but does not block compilation), so even `compile` does not hard-fail on
817
+ type errors.
818
+
819
+ **Redesign:** `check` should run: parse → name resolution → full type checking,
820
+ and exit non-zero on any diagnostic. The current "warn mode" type checker should
821
+ become "error mode".
822
+
823
+ ---
824
+
825
+ ### `redscript watch` + `--hot-reload` — live reload against a running server
826
+
827
+ Watch mode recompiles on every `.mcrs` file change in a directory, then
828
+ optionally POSTs to a hot-reload endpoint:
829
+
830
+ ```bash
831
+ redscript watch src/ -o ~/mc-test-server/world/datapacks/rsdemo \
832
+ --namespace rsdemo \
833
+ --hot-reload http://localhost:25570
834
+ ```
835
+
836
+ On each successful compile, it calls `POST <url>/reload`, which is expected
837
+ to trigger `/reload` on the MC server. This gives a **save → auto-deploy →
838
+ `/reload`** loop without switching to the game.
839
+
840
+ ```
841
+ Save .mcrs
842
+ → recompile (< 1 s)
843
+ → write .mcfunction files to datapack dir
844
+ → POST /reload
845
+ → server reloads datapack
846
+ → test in-game immediately
847
+ ```
848
+
849
+ The hot-reload server is a tiny HTTP listener that must be running alongside
850
+ the MC server. Currently this is a manual setup (run a small HTTP server that
851
+ calls RCON `/reload`).
852
+
853
+ **Known limitation:** watch mode compiles all `.mcrs` files in the directory on
854
+ every change, not just the changed file. For large projects this is wasteful.
855
+ Incremental compilation (track which files changed, only recompile affected
856
+ functions) is a future improvement.
857
+
858
+ ---
859
+
860
+ ### `redscript repl` — interactive expression evaluator
861
+
862
+ Starts a read-eval-print loop. Accepts RedScript expressions and statements,
863
+ compiles them, runs them through `MCRuntime` (the in-process scoreboard
864
+ simulator), and prints the result.
865
+
866
+ Useful for quickly testing arithmetic, `sin_fixed` values, or algorithm
867
+ correctness without deploying to a server.
868
+
869
+ ```
870
+ > let x: int = sin_fixed(45);
871
+ x = 707
872
+ > sin_fixed(45) * sin_fixed(45) + cos_fixed(45) * cos_fixed(45)
873
+ = 999698
874
+ ```
875
+
876
+ **Known limitation:** the REPL resets all state between expressions (no
877
+ persistent variable binding across lines). Calling stdlib functions that depend
878
+ on `@load` initialization (e.g. `sin_fixed` table load) may not work correctly
879
+ unless the REPL explicitly runs the load function first.
880
+
881
+ ---
882
+
883
+ ### `--no-mangle` flag — readable variable names for debugging
884
+
885
+ By default, IR variable names are mangled to short names (`$a`, `$b`, `$ad`...)
886
+ to keep scoreboard objective slot names short. With `--no-mangle`, the original
887
+ source variable names are preserved:
888
+
889
+ ```bash
890
+ redscript compile demo.mcrs -o /tmp/out --no-mangle
891
+ ```
892
+
893
+ Generated mcfunction uses `$phase __rsdemo` instead of `$c __rsdemo`, making
894
+ it possible to read the output and correlate with source.
895
+
896
+ ---
897
+
898
+ ### Sourcemap
899
+
900
+ Every compile outputs a `.map.json` file alongside the datapack:
901
+
902
+ ```
903
+ /tmp/rsdemo/rsdemo.map.json
904
+ ```
905
+
906
+ Maps each generated `.mcfunction` path back to the source `.mcrs` file and
907
+ line number. Currently used for error reporting. Future use: step-through
908
+ debugger that maps MC function calls back to source lines.
909
+
910
+ ---
911
+
912
+ ### `MCRuntime` — in-process MC simulator (used by tests)
913
+
914
+ `src/runtime/index.ts` implements a simulated MC execution environment:
915
+ - Scoreboard: fake-player → objective → INT32 value
916
+ - NBT storage: nested map of NBT values
917
+ - Function call stack: dispatches `function ns:path` to the compiled output
918
+ - `execute if/unless score`: evaluated against the simulated scoreboard
919
+ - `execute store result score ... run ...`: captures command return value
920
+
921
+ All 920 tests use `MCRuntime` to run compiled datapacks in-process without a
922
+ real MC server. This makes the test suite fast (< 35 s for all 920 tests) and
923
+ server-independent.
924
+
925
+ **What `MCRuntime` does not simulate:**
926
+ - Entity selectors (`@a`, `@e`, `@p`, `@s`) — tests must mock these
927
+ - World block state (`setblock`, `fill`) — not tracked
928
+ - Particle/sound/title commands — silently ignored
929
+ - Tick scheduling — tests call `@tick` functions manually
930
+
931
+ ---
932
+
933
+ ### Test server: Paper 1.21.4 at `~/mc-test-server`
934
+
935
+ For integration testing that requires real MC behavior (entity selectors,
936
+ actual particle rendering, boss bars, etc.):
937
+
938
+ ```bash
939
+ # Start
940
+ cd ~/mc-test-server
941
+ /opt/homebrew/opt/openjdk@21/bin/java -jar paper.jar --nogui
942
+
943
+ # Deploy a datapack
944
+ redscript compile examples/readme-demo.mcrs \
945
+ -o ~/mc-test-server/world/datapacks/rsdemo \
946
+ --namespace rsdemo
947
+
948
+ # In-game or via RCON
949
+ /reload
950
+ /function rsdemo:start
951
+ ```
952
+
953
+ Server details:
954
+ - Paper 1.21.4-232
955
+ - Port 25561
956
+ - Java: `/opt/homebrew/opt/openjdk@21/bin/java`
957
+ - Accessible via Tailscale at `100.73.231.27:25561`
958
+
959
+
960
+ ---
961
+
962
+ ## Language Semantics Design (Redesign Decisions)
963
+
964
+ ### Visibility & DCE: `export` replaces `@keep`
965
+
966
+ **Current:** `@keep` forces a function to survive DCE. Everything without `@keep`
967
+ is potentially eliminated if unreachable.
968
+
969
+ **Redesign:** Use `export` as the explicit public-API marker, matching TypeScript/JS conventions.
970
+
971
+ ```redscript
972
+ export fn spawn_wave() { ... } // public — never DCE'd
973
+ fn _helper(x: int): int { ... } // private — eliminated if unreachable
974
+
975
+ @tick fn _tick() { ... } // @tick implies export (referenced by tick.json)
976
+ @load fn _load() { ... } // @load implies export (referenced by load.json)
977
+ ```
978
+
979
+ Rules:
980
+ - `export` → never DCE'd; accessible from other datapacks / MC
981
+ - no `export` → private; DCE applies
982
+ - `@tick` / `@load` implicitly export the function and wire it into `tick.json` / `load.json`
983
+ - `@require_on_load` (current stdlib pragma) → absorbed into `@load` or library `export` semantics
984
+
985
+ `module library;` pragma stays: marks a file as a library (all exports are
986
+ available for import; nothing auto-runs at load time).
987
+
988
+ ---
989
+
990
+ ### Struct: value type, no heap, no references
991
+
992
+ `struct` is kept (not renamed to `class`) because the value-type semantics are
993
+ immediately obvious from the name — same as C/C++/Rust structs.
994
+
995
+ ```redscript
996
+ struct Vec2 {
997
+ x: int;
998
+ y: int;
999
+ }
1000
+
1001
+ impl Vec2 {
1002
+ fn length_sq(self): int {
1003
+ return self.x * self.x + self.y * self.y;
1004
+ }
1005
+ }
1006
+
1007
+ let v: Vec2 = Vec2 { x: 3, y: 4 };
1008
+ let d: int = v.length_sq(); // = 25
1009
+ ```
1010
+
1011
+ **Constraints (by MC target):**
1012
+ - No heap allocation. A `Vec2` is two scoreboard slots, not a pointer.
1013
+ - No references. `let a = v; a.x = 10` does **not** modify `v`.
1014
+ - No dynamic dispatch / vtables. Method calls are statically resolved at compile time.
1015
+ - No inheritance. Composition only.
1016
+ - Struct fields cannot be `string` (strings cannot live in scoreboard).
1017
+
1018
+ **Why not `class`?** "Class" implies heap allocation and reference semantics in
1019
+ most languages. Using `struct` sets the correct expectation: this is a named
1020
+ group of scoreboard slots, not a Java-style object.
1021
+
1022
+ ---
1023
+
1024
+ ### Macro functions: transparent to users
1025
+
1026
+ A **macro function** is one that uses a parameter as a dynamic MC coordinate or
1027
+ array index — positions where MC requires literal values but we want runtime
1028
+ substitution via the 1.20.2+ function macro mechanism.
1029
+
1030
+ ```redscript
1031
+ // User writes this — looks like a normal function:
1032
+ fn draw_pt(px: float, py: float) {
1033
+ particle("minecraft:end_rod", ^px, ^py, ^5, 0.02, 0.02, 0.02, 0.0, 10);
1034
+ }
1035
+ ```
1036
+
1037
+ The compiler detects that `px`/`py` appear in `^`-coordinate positions and
1038
+ automatically emits:
1039
+ 1. A macro function file (`$particle ... ^$(px) ^$(py) ...`)
1040
+ 2. A call site that writes args to `rs:macro_args` storage and calls
1041
+ `function ns:draw_pt with storage rs:macro_args`
1042
+
1043
+ **Users never write `$` or `with storage`** — the compiler handles it.
1044
+
1045
+ In the redesign, macro-function status can be auto-detected (current behavior)
1046
+ or explicitly annotated `@macro fn draw_pt(...)`. Auto-detection is simpler for
1047
+ users; explicit annotation makes it clearer in large codebases. Decision: keep
1048
+ auto-detection, but emit a diagnostic if a function is unexpectedly promoted to
1049
+ macro status (so users are aware).
1050
+
1051
+ ---
1052
+
1053
+ ### Error handling & diagnostics
1054
+
1055
+ **Goal:** report all errors in a file before stopping, not just the first one.
1056
+ This is critical for IDE integration (the language server must not crash on the
1057
+ first typo).
1058
+
1059
+ **Approach: panic-mode error recovery in the parser**
1060
+
1061
+ When the parser encounters an unexpected token, it:
1062
+ 1. Records the error with source span
1063
+ 2. Skips tokens until it finds a synchronization point: `fn`, `}`, `;`, `@tick`, `@load`, EOF
1064
+ 3. Resumes parsing from that point
1065
+
1066
+ This collects multiple independent errors before stopping:
1067
+
1068
+ ```
1069
+ Error at line 5: expected ':' but got '='
1070
+ Error at line 12: unknown type 'flot'
1071
+ Error at line 18: undefined variable 'phse'
1072
+ 3 errors found.
1073
+ ```
1074
+
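The three recovery steps above can be sketched in TypeScript as follows. This is a simplified illustration, not the RedScript parser: the `Token` shape, the toy grammar (a flat list of `let <ident> : <ident> ;` declarations), and the function name are all assumptions.

```typescript
// Hypothetical sketch of panic-mode recovery over a toy grammar.
interface Token { kind: string; line: number; }
interface Diagnostic { line: number; message: string; }

// Synchronization points taken from the list above.
const SYNC_KINDS = new Set(["fn", "}", ";", "@tick", "@load", "eof"]);

function parseWithRecovery(tokens: Token[]): Diagnostic[] {
  const diagnostics: Diagnostic[] = [];
  let pos = 0;
  const peek = (): Token => tokens[pos] ?? { kind: "eof", line: -1 };

  const expect = (kind: string): boolean => {
    const tok = peek();
    if (tok.kind === kind) { pos++; return true; }
    // Step 1: record the error with its source position.
    diagnostics.push({ line: tok.line, message: `expected '${kind}' but got '${tok.kind}'` });
    return false;
  };

  while (peek().kind !== "eof") {
    const start = pos;
    const ok = expect("let") && expect("ident") && expect(":") && expect("ident") && expect(";");
    if (!ok) {
      // Step 2: skip tokens until a synchronization point.
      while (!SYNC_KINDS.has(peek().kind)) pos++;
      const kind = peek().kind;
      // Step 3: resume after a terminator; also force progress if nothing was consumed.
      if (kind !== "eof" && (kind === ";" || kind === "}" || pos === start)) pos++;
    }
  }
  return diagnostics;
}
```

The forced-progress check matters: without it, a statement that begins with a synchronization token such as `fn` would be reported forever without the parser advancing.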
1075
+ **Not doing incremental parsing.** Incremental parsing (re-parse only changed
1076
+ sections) is a separate project requiring a tree-sitter-style persistent parse
1077
+ tree. The benefit for RedScript's typical file sizes (< 500 lines) is minimal,
1078
+ and the implementation cost is high. Panic-mode recovery is sufficient for
1079
+ a good developer experience.
1080
+
1081
+ **Diagnostic severity levels:**
1082
+
1083
+ | Level | Use |
1084
+ |---|---|
1085
+ | `error` | Compilation fails. Type mismatch, undefined symbol, syntax error. |
1086
+ | `warning` | Compilation succeeds but something is suspicious. Unused variable, unreachable code. |
1087
+ | `hint` | Informational. Style suggestions, implicit conversions. |
1088
+
1089
+ **Current `TypeChecker` is in "warn mode"** (type errors do not block compilation).
1090
+ In the redesign, type errors are `error` level and do block compilation.
1091
+
1092
+ ---
1093
+
1094
+ ### Type system
1095
+
1096
+ #### Primitive types
1097
+
1098
+ | Type | Storage | Notes |
1099
+ |---|---|---|
1100
+ | `int` | scoreboard INT32 | All arithmetic. Range: −2³¹ to 2³¹−1 |
1101
+ | `float` | scoreboard INT32 (×1000) | Fixed-point. `1.5` stored as `1500`. Use `mulfix`/`divfix` for ×/÷ |
1102
+ | `bool` | scoreboard 0 or 1 | `true`=1, `false`=0 |
1103
+ | `string` | NBT string | Cannot do arithmetic. Only usable in command/NBT contexts |
1104
+ | `void` | — | Function returns nothing |
1105
+
1106
+ #### MC-specific types
1107
+
1108
+ | Type | Notes |
1109
+ |---|---|
1110
+ | `selector<entity>` | Not a runtime value. Only usable in `foreach` / command contexts |
1111
+ | `selector<player>` | Subtype of `selector<entity>` for player-only selectors |
1112
+ | `BlockPos` | Coordinate triple. Only usable in command contexts |
1113
+
1114
+ #### Compound types
1115
+
1116
+ | Type | Notes |
1117
+ |---|---|
1118
+ | `struct Foo { ... }` | Value type. Fields are independent scoreboard slots |
1119
+ | `int[]` | NBT integer array. Dynamic access requires macro |
1120
+ | `(a: int) => int` | Function type. Used for stdlib callbacks |
1121
+
1122
+ #### Key design decisions
1123
+
1124
+ **1. Nominal typing (not structural)**
1125
+
1126
+ Two structs with identical fields are NOT compatible:
1127
+
1128
+ ```redscript
1129
+ struct Vec2 { x: int; y: int; }
1130
+ struct Point { x: int; y: int; }
1131
+
1132
+ fn distance(a: Vec2, b: Vec2): int { ... }
1133
+ let p: Point = Point { x: 1, y: 2 };
1134
+ distance(p, p) // ERROR: Point is not Vec2
1135
+ ```
1136
+
1137
+ Rationale: `Vec2` and `Point` use different scoreboard slot names. Structural
1138
+ compatibility would require a copy — which must be explicit.
1139
+
1140
+ **2. No implicit type conversion**
1141
+
1142
+ ```redscript
1143
+ let x: int = 5;
1144
+ let y: float = x; // ERROR: use 'x as float' (explicit ×1000 conversion)
1145
+ let z: float = x as float; // OK: z = 5000 internally
1146
+ ```
1147
+
1148
+ Rationale: `int → float` is a ×1000 multiply. Making it implicit hides a
1149
+ potentially significant operation and makes arithmetic bugs hard to find.
1150
+
1151
+ **3. No null**
1152
+
1153
+ Scoreboard slots are always initialized to 0. There is no null/undefined/None
1154
+ in RedScript. No nullable types, no optional chaining.
1155
+
1156
+ **4. No union types, no conditional types**
1157
+
1158
+ These require runtime type tags, which cost scoreboard slots and add dispatch
1159
+ overhead. Not worth it at MC scale.
1160
+
1161
+ **5. Simple generics only**
1162
+
1163
+ `selector<entity>` and `selector<player>` are essentially two distinct concrete
1164
+ types, not a generic in the full sense. Generic functions (e.g. stdlib
1165
+ `foreach<T>`) are specialized at each call site by the compiler — no runtime
1166
+ polymorphism.
1167
+
1168
+ #### Type inference
1169
+
1170
+ Local variables: `let x = 5` → `int`, `let x = 5.0` → `float`, `let x = true` → `bool`.
1171
+ Function return types: inferred from `return` statements if not annotated.
1172
+ Struct fields: must be explicitly typed.
1173
+
1174
+ **Implementation estimate:** ~800–1200 lines of TypeScript. Two weeks.
1175
+
1176
+ ---
1177
+
1178
+ ### Incremental compilation: explicitly deferred
1179
+
1180
+ **Decision: do not implement.** `watch` mode already recompiles in < 1 second
1181
+ for typical project sizes. The complexity of tracking a file-level dependency
1182
+ graph and invalidating only affected functions outweighs the benefit.
1183
+
1184
+ If project sizes grow to hundreds of files and compilation becomes noticeably
1185
+ slow, incremental compilation can be added as a separate pass: build a module
1186
+ dependency DAG, recompile only the subgraph invalidated by a file change.
1187
+ Architecture note for the future: keep module loading separated from lowering
1188
+ so that a cached module's IR can be reused without re-parsing.
1189
+
1190
+
1191
+ ---
1192
+
1193
+ ## MC Execution Budget & Coroutine Transform
1194
+
1195
+ ### The 65536 command budget
1196
+
1197
+ `maxCommandChainLength` (gamerule, default 65536) limits the **total number of
1198
+ commands executed per game tick**, summed across all function calls triggered
1199
+ in that tick. It is a tick budget, not a per-function call depth limit.
1200
+
1201
+ A separate limit (~512 levels) applies to nested function call depth (JVM stack).
1202
+ This rarely matters for compiled output, which is usually shallow chains of
1203
+ `execute if ... run function`.
1204
+
1205
+ **Practical implications:**
1206
+ - A loop of 1000 iterations × 10 commands/iteration = 10,000 commands. Safe.
1207
+ - A loop of 1000 iterations × 100 commands/iteration = 100,000 commands. Exceeds budget — only ~655 iterations actually run; the rest are silently dropped.
1208
+ - The budget can be raised by a server admin: `gamerule maxCommandChainLength 1000000`. Cannot be changed from within a datapack.
1209
+
1210
+ **Compiler response:** The redesigned compiler should statically estimate the
1211
+ command count of loops and emit a `warning` when the estimated count approaches
1212
+ the budget.
1213
+
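A static estimate of this kind could look like the TypeScript sketch below. It assumes a constant trip count and a fixed per-iteration command count; the names `estimateLoop` and `LoopEstimate` are hypothetical, and a real pass would have to derive both inputs from the IR.

```typescript
// Hypothetical sketch of a loop budget estimate against maxCommandChainLength.
const DEFAULT_BUDGET = 65536;

interface LoopEstimate {
  totalCommands: number;      // iterations * commands per iteration
  exceedsBudget: boolean;     // true when the loop cannot finish within the budget
  iterationsThatFit: number;  // how many full iterations fit within the budget
}

function estimateLoop(
  iterations: number,
  commandsPerIteration: number,
  budget: number = DEFAULT_BUDGET,
): LoopEstimate {
  const totalCommands = iterations * commandsPerIteration;
  const iterationsThatFit = Math.min(iterations, Math.floor(budget / commandsPerIteration));
  return { totalCommands, exceedsBudget: totalCommands > budget, iterationsThatFit };
}

// The two cases from the list above.
const light = estimateLoop(1000, 10);  // 10,000 commands, fits
const heavy = estimateLoop(1000, 100); // 100,000 commands, only 655 iterations fit
```

A warning threshold below the hard limit (for example a fixed fraction of the budget) would be a policy choice on top of this estimate.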
1214
+ ---
1215
+
1216
+ ### Coroutine Transform: automatic tick-splitting
1217
+
1218
+ For computations that genuinely require more commands than one tick allows,
1219
+ the compiler can automatically transform a long-running function into a
1220
+ tick-spread state machine. This is conceptually the same rewrite that the C#
1220
+ compiler applies to `yield return` and that transpilers apply to JavaScript
1221
+ `async/await` and generators, targeting the MC tick scheduler instead of an event loop.
1223
+
1224
+ #### Usage (proposed syntax)
1225
+
1226
+ ```redscript
1227
+ @coroutine(batch=10) fn process_all() {
1228
+ for (let i: int = 0; i < 1000; i++) {
1229
+ do_work(i); // heavy per-iteration work
1230
+ }
1231
+ finish();
1232
+ }
1233
+ ```
1234
+
1235
+ `@coroutine(batch=N)` tells the compiler: "split this function's loops so that
1236
+ each tick advances at most N iterations". If `batch` is omitted, the compiler
1237
+ estimates it from the loop body's command count.
1238
+
1239
+ #### How the transform works
1240
+
1241
+ **Step 1: Find yield points**
1242
+
1243
+ In the MIR CFG, find all back edges (edges that jump to a dominator block —
1244
+ i.e., loop headers). A yield point is inserted at each back edge, triggered
1245
+ every `batch` iterations.
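
Back-edge detection can be sketched as follows. This is an illustration only: the `CFG` shape (block id to successor ids) is an assumption rather than the real MIR type, every block is assumed reachable from the entry, and the simple iterative set-based dominator computation stands in for whatever algorithm the compiler ends up using.

```typescript
type BlockId = string
// Hypothetical minimal CFG: block id -> successor ids.
type CFG = Map<BlockId, BlockId[]>

function computeDominators(cfg: CFG, entry: BlockId): Map<BlockId, Set<BlockId>> {
  const all = Array.from(cfg.keys())
  const preds = new Map<BlockId, BlockId[]>()
  for (const b of all) preds.set(b, [])
  cfg.forEach((succs, b) => { for (const s of succs) preds.get(s)!.push(b) })

  // Start from "everything dominates everything" and shrink to a fixpoint.
  const dom = new Map<BlockId, Set<BlockId>>()
  for (const b of all) dom.set(b, new Set<BlockId>(b === entry ? [entry] : all))

  let changed = true
  while (changed) {
    changed = false
    for (const b of all) {
      if (b === entry) continue
      // dom(b) = {b} ∪ intersection of dom(p) over all predecessors p
      let acc: BlockId[] | undefined
      for (const p of preds.get(b)!) {
        const dp = dom.get(p)!
        acc = acc === undefined ? Array.from(dp) : acc.filter(x => dp.has(x))
      }
      const next = new Set<BlockId>(acc ?? [])
      next.add(b)
      // Sets only ever shrink, so a size change means the set changed.
      if (next.size !== dom.get(b)!.size) { dom.set(b, next); changed = true }
    }
  }
  return dom
}

// An edge from -> to is a back edge when `to` dominates `from`.
function findBackEdges(cfg: CFG, entry: BlockId): [BlockId, BlockId][] {
  const dom = computeDominators(cfg, entry)
  const edges: [BlockId, BlockId][] = []
  cfg.forEach((succs, from) => {
    for (const to of succs) if (dom.get(from)!.has(to)) edges.push([from, to])
  })
  return edges
}

// CFG shape assumed for process_all(): entry -> header -> (body -> header | exit)
const exampleCfg: CFG = new Map<BlockId, BlockId[]>([
  ['entry', ['loop_header']],
  ['loop_header', ['loop_body', 'exit']],
  ['loop_body', ['loop_header']],
  ['exit', []],
])
```

For this CFG, `findBackEdges(exampleCfg, 'entry')` yields the single edge `loop_body -> loop_header`, which is where the yield point would be inserted.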
1246
+
1247
+ **Step 2: Liveness analysis at yield points**
1248
+
1249
+ Compute the set of variables live at each yield point. These variables must
1250
+ persist across ticks — they cannot stay in function-local temporary slots.
1251
+ They are promoted to persistent scoreboard slots (or NBT for arrays/structs).
1252
+
1253
+ In the example above, `i` is the only live variable at the loop's yield point.
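
The liveness step is standard backward dataflow. The sketch below works at block granularity and assumes each block has already been summarized into `use` (variables read before any write in the block) and `def` (variables written) sets. The `LiveBlock` shape and the block names are hypothetical.

```typescript
// Hypothetical per-block summary; not the real MIR block type.
interface LiveBlock {
  id: string
  use: string[]   // read before any write inside the block
  def: string[]   // written inside the block
  succs: string[]
}

function computeLiveness(blocks: LiveBlock[]) {
  const liveIn = new Map<string, Set<string>>()
  const liveOut = new Map<string, Set<string>>()
  for (const b of blocks) {
    liveIn.set(b.id, new Set<string>())
    liveOut.set(b.id, new Set<string>())
  }

  let changed = true
  while (changed) {
    changed = false
    // Visiting blocks in reverse order tends to converge faster for a backward analysis.
    for (const b of blocks.slice().reverse()) {
      // liveOut(b) = union of liveIn(s) over successors s
      const out = new Set<string>()
      for (const s of b.succs) liveIn.get(s)!.forEach(v => out.add(v))
      // liveIn(b) = use(b) ∪ (liveOut(b) − def(b))
      const inn = new Set<string>(b.use)
      out.forEach(v => { if (b.def.indexOf(v) === -1) inn.add(v) })
      // Sets only grow, so comparing sizes is enough to detect a change.
      if (out.size !== liveOut.get(b.id)!.size || inn.size !== liveIn.get(b.id)!.size) changed = true
      liveOut.set(b.id, out)
      liveIn.set(b.id, inn)
    }
  }
  return { liveIn, liveOut }
}

// Block summaries assumed for process_all().
const exampleBlocks: LiveBlock[] = [
  { id: 'entry', use: [], def: ['i'], succs: ['loop_header'] },          // let i = 0
  { id: 'loop_header', use: ['i'], def: [], succs: ['loop_body', 'exit'] }, // i < 1000
  { id: 'loop_body', use: ['i'], def: ['i'], succs: ['loop_header'] },   // do_work(i); i++
  { id: 'exit', use: [], def: [], succs: [] },                           // finish()
]
```

The live set at the yield point is `liveOut` of the block that ends in the back edge. For `loop_body` it contains only `i`, matching the statement above.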
1254
+
1255
+ **Step 3: Split the CFG into continuations**
1256
+
1257
+ Each segment between yield points becomes a separate function. A `pc`
1258
+ (program counter) scoreboard slot tracks which continuation runs next.
1259
+
1260
+ ```
1261
+ coroutine_state:
1262
+ i → $coro_i __ns (promoted temp)
1263
+ pc → $coro_pc __ns (program counter)
1264
+
1265
+ continuation_1 (loop body, batch iterations):
1266
+ for (batch_count = 0; batch_count < BATCH && i < 1000; batch_count++) {
1267
+ do_work(i)
1268
+ i++
1269
+ }
1270
+ if i >= 1000: pc = 2 # advance to finish()
1271
+ else: pc = 1 # resume loop next tick
1272
+ return # end this tick's work
1273
+
1274
+ continuation_2:
1275
+ finish()
1276
+ pc = -1 # done
1277
+ ```
1278
+
1279
+ **Step 4: Generate the dispatcher**
1280
+
1281
+ ```redscript
1282
+ @tick fn _coro_process_all_tick() {
1283
+ // generated — do not edit
1284
+ execute if score $coro_pc __ns matches 1 run function ns:_coro_cont_1
1285
+ execute if score $coro_pc __ns matches 2 run function ns:_coro_cont_2
1286
+ }
1287
+ ```
1288
+
1289
+ The original `process_all()` call site becomes:
1290
+ ```
1291
+ scoreboard players set $coro_pc __ns 1 # start from continuation_1
1292
+ scoreboard players set $coro_i __ns 0 # initialize i
1293
+ ```
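
To sanity-check the state machine, the generated pieces can be simulated outside the game. The sketch below models the scoreboard as a `Map` and the two continuations as closures. `runCoroutine` and the counters are illustrative stand-ins, not compiler output.

```typescript
// Hypothetical simulation of the dispatcher and continuations shown above.
function runCoroutine(
  total: number,
  batch: number,
): { ticks: number; workDone: number; finished: boolean } {
  // Call site: initialize pc and i.
  const score = new Map<string, number>([['$coro_pc', 1], ['$coro_i', 0]])
  let workDone = 0
  let finished = false

  // continuation_1: at most `batch` loop iterations, then pick the next pc.
  const cont1 = () => {
    let i = score.get('$coro_i')!
    for (let n = 0; n < batch && i < total; n++) {
      workDone++ // stands in for do_work(i)
      i++
    }
    score.set('$coro_i', i)
    score.set('$coro_pc', i >= total ? 2 : 1)
  }

  // continuation_2: finish(), then mark the coroutine as done.
  const cont2 = () => {
    finished = true
    score.set('$coro_pc', -1)
  }

  let ticks = 0
  while (score.get('$coro_pc') !== -1) {
    ticks++
    // Dispatcher: both checks run every tick, in this order.
    if (score.get('$coro_pc') === 1) cont1()
    if (score.get('$coro_pc') === 2) cont2()
  }
  return { ticks, workDone, finished }
}
```

With `total = 1000` and `batch = 10` the simulation takes 100 ticks. Because the dispatcher tests `pc == 2` after `pc == 1`, `finish()` runs in the same tick as the last batch; if it should run one tick later, the two dispatcher lines would need to be swapped.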
1294
+
1295
+ #### Algorithm components
1296
+
1297
+ | Component | Technique | Est. size |
1298
+ |---|---|---|
1299
+ | Find yield points | Dominator tree + back-edge detection | ~100 lines |
1300
+ | Live variable analysis | Standard dataflow (backwards liveness) | ~150 lines |
1301
+ | CFG splitting | Insert `pc = N; return` at yield points | ~200 lines |
1302
+ | Variable promotion | Assign persistent slots to live vars at yields | ~100 lines |
1303
+ | Dispatch generation | `execute if score pc matches N` chain | ~50 lines |
1304
+ | Batch size estimation | Static command-count estimate per loop body | ~100 lines |
1305
+
1306
+ **Total:** ~700 lines. Approximately 3–4 weeks of focused implementation.
1307
+ Prerequisite: proper MIR CFG with dominator tree (Stage 3 of the new pipeline).
1308
+
1309
+ #### Placement in the pipeline
1310
+
1311
+ This transform runs in **Stage 4 (MIR optimization passes)** as an opt-in pass:
1312
+
1313
+ ```typescript
1314
+ const pipeline: Pass[] = [
1315
+ constantFold,
1316
+ copyProp,
1317
+ dce,
1318
+ ...(options.coroutine ? [coroutineTransform] : []), // opt-in
1319
+ destinationForwarding,
1320
+ blockMerge,
1321
+ ]
1322
+ ```
1323
+
1324
+ It is not run by default — only on functions annotated `@coroutine`. The
1325
+ compiler will warn if a non-annotated loop is estimated to exceed the tick
1326
+ budget, suggesting the user add `@coroutine`.
1327
+
1328
+ #### What the transform does NOT do
1329
+
1330
+ - Does not handle `return` with a value from inside a coroutine (complex; deferred).
1331
+ - Does not support nested coroutines (a `@coroutine` calling another `@coroutine`).
1332
+ - Does not handle exceptions (RedScript has none, so this is fine).
1333
+ - Does not parallelize — MC is single-threaded; `@tick` runs one coroutine step per tick.
1334
+
1335
+ #### Relationship to Timer and manual state machines
1336
+
1337
+ RedScript's `Timer` stdlib is a manually written version of this pattern.
1338
+ The coroutine transform automates what users currently write by hand
1339
+ when they need multi-tick computations.
1340
+
1341
+
1342
+ ---
1343
+
1344
+ ## MIR Instruction Set (Specification)
1345
+
1346
+ MIR is 3-address, versioned temporaries, explicit CFG.
1347
+ Every instruction produces at most one result into a fresh temporary.
1348
+
1349
+ ### Types
1350
+
1351
+ ```typescript
1352
+ // A temporary variable — unique within a function, named t0, t1, t2...
1353
+ type Temp = string
1354
+
1355
+ // An operand: either a temp or an inline constant
1356
+ type Operand =
1357
+ | { kind: 'temp', name: Temp }
1358
+ | { kind: 'const', value: number }
1359
+
1360
+ // A basic block identifier
1361
+ type BlockId = string
1362
+
1363
+ // Comparison operators (for cmp instruction)
1364
+ type CmpOp = 'eq' | 'ne' | 'lt' | 'le' | 'gt' | 'ge'
1365
+
1366
+ // NBT value types (for nbt_write)
1367
+ type NBTType = 'int' | 'double' | 'float' | 'long' | 'short' | 'byte'
1368
+ ```
1369
+
1370
+ ### Instructions
1371
+
1372
+ ```typescript
1373
+ type MIRInstr =
1374
+ // ── Constants & copies ──────────────────────────────────────────────────
1375
+ | { kind: 'const', dst: Temp, value: number }
1376
+ // dst = value
1377
+
1378
+ | { kind: 'copy', dst: Temp, src: Operand }
1379
+ // dst = src
1380
+
1381
+ // ── Integer arithmetic ──────────────────────────────────────────────────
1382
+ | { kind: 'add', dst: Temp, a: Operand, b: Operand }
1383
+ | { kind: 'sub', dst: Temp, a: Operand, b: Operand }
1384
+ | { kind: 'mul', dst: Temp, a: Operand, b: Operand }
1385
+ | { kind: 'div', dst: Temp, a: Operand, b: Operand } // truncated toward zero
1386
+ | { kind: 'mod', dst: Temp, a: Operand, b: Operand } // sign follows dividend
1387
+ | { kind: 'neg', dst: Temp, src: Operand } // dst = -src
1388
+
1389
+ // ── Comparison (result is 0 or 1) ────────────────────────────────────────
1390
+ | { kind: 'cmp', dst: Temp, op: CmpOp, a: Operand, b: Operand }
1391
+
1392
+ // ── Boolean logic ────────────────────────────────────────────────────────
1393
+ | { kind: 'and', dst: Temp, a: Operand, b: Operand }
1394
+ | { kind: 'or', dst: Temp, a: Operand, b: Operand }
1395
+ | { kind: 'not', dst: Temp, src: Operand }
1396
+
1397
+ // ── NBT storage ──────────────────────────────────────────────────────────
1398
+ | { kind: 'nbt_read',
1399
+ dst: Temp,
1400
+ ns: string, path: string,
1401
+ scale: number }
1402
+ // Compiles to: execute store result score $dst <obj> run data get storage ns path scale
1403
+
1404
+ | { kind: 'nbt_write',
1405
+ ns: string, path: string,
1406
+ type: NBTType, scale: number,
1407
+ src: Operand }
1408
+ // Compiles to: execute store result storage ns path type scale run scoreboard players get $src <obj>
1409
+ // or: data modify storage ns path set value <literal> (when src is const)
1410
+
1411
+ // ── Function calls ────────────────────────────────────────────────────────
1412
+ | { kind: 'call',
1413
+ dst: Temp | null,
1414
+ fn: string,
1415
+ args: Operand[] }
1416
+ // Regular function call. args are passed via $p0..$pN scoreboard slots.
1417
+
1418
+ | { kind: 'call_macro',
1419
+ dst: Temp | null,
1420
+ fn: string,
1421
+ args: { name: string, value: Operand, type: NBTType, scale: number }[] }
1422
+ // Macro function call. args written to rs:macro_args storage.
1423
+ // Compiles to: [store each arg] + function fn with storage rs:macro_args
1424
+
1425
+ | { kind: 'call_context',
1426
+ fn: string,
1427
+ subcommands: ExecuteSubcmd[] }
1428
+ // execute [subcommands] run function fn
1429
+ // No return value (execute context calls are void).
1430
+
1431
+ // ── Terminators (exactly one per basic block, must be last) ──────────────
1432
+ | { kind: 'jump', target: BlockId }
1433
+ // Unconditional jump.
1434
+
1435
+ | { kind: 'branch', cond: Operand, then: BlockId, else: BlockId }
1436
+ // Conditional jump. cond must be 0 or 1.
1437
+
1438
+ | { kind: 'return', value: Operand | null }
1439
+ // Return from function (null = void return).
1440
+ ```
1441
+
1442
+ ### Basic block and function structure
1443
+
1444
+ ```typescript
1445
+ interface MIRBlock {
1446
+ id: BlockId
1447
+ instrs: MIRInstr[] // non-terminator instructions
1448
+ term: MIRInstr // must be jump | branch | return
1449
+ preds: BlockId[] // predecessor block ids (for dataflow)
1450
+ }
1451
+
1452
+ interface MIRFunction {
1453
+ name: string
1454
+ params: { name: Temp, isMacroParam: boolean }[]
1455
+ blocks: MIRBlock[]
1456
+ entry: BlockId // entry block id (always 'entry')
1457
+ isMacro: boolean // true if any param is a macro param
1458
+ }
1459
+
1460
+ interface MIRModule {
1461
+ functions: MIRFunction[]
1462
+ namespace: string
1463
+ objective: string // scoreboard objective (default: __<namespace>)
1464
+ }
1465
+ ```
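
The structural rules stated in the comments above (one terminator per block, always last, targets must exist) are the kind of thing a MIR verifier would check. A minimal sketch, using a reduced block shape (`VBlock`, hypothetical) so that the example stays self-contained:

```typescript
const TERMINATORS = ['jump', 'branch', 'return']

// Reduced, hypothetical view of MIRBlock: only the fields the checks need.
interface VBlock {
  id: string
  instrs: { kind: string }[]
  term: { kind: string; target?: string; then?: string; else?: string }
}

function verifyFunction(fn: { name: string; entry: string; blocks: VBlock[] }): string[] {
  const errors: string[] = []
  const ids = new Set<string>(fn.blocks.map(b => b.id))

  if (!ids.has(fn.entry)) errors.push(`${fn.name}: missing entry block '${fn.entry}'`)

  for (const b of fn.blocks) {
    // term must be jump | branch | return
    if (TERMINATORS.indexOf(b.term.kind) === -1) {
      errors.push(`${b.id}: term is '${b.term.kind}', expected jump | branch | return`)
    }
    // no terminator may appear among the non-terminator instructions
    for (const i of b.instrs) {
      if (TERMINATORS.indexOf(i.kind) !== -1) {
        errors.push(`${b.id}: terminator '${i.kind}' found in instrs`)
      }
    }
    // every jump/branch target must name an existing block
    for (const t of [b.term.target, b.term.then, b.term.else]) {
      if (t !== undefined && !ids.has(t)) errors.push(`${b.id}: unknown target '${t}'`)
    }
  }
  return errors
}
```

An empty result means the function passed these checks. A fuller verifier would presumably also check `preds` consistency and that each temporary is defined before use.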
1466
+
1467
+ ### Execute subcommands (used in `call_context`)
1468
+
1469
+ ```typescript
1470
+ type ExecuteSubcmd =
1471
+ | { kind: 'as', selector: string } // as @e[tag=foo]
1472
+ | { kind: 'at', selector: string } // at @e[tag=foo]
1473
+ | { kind: 'at_self' } // at @s
1474
+ | { kind: 'positioned', x: string, y: string, z: string }
1475
+ | { kind: 'rotated', yaw: string, pitch: string }
1476
+ | { kind: 'in', dimension: string }
1477
+ | { kind: 'anchored', anchor: 'eyes' | 'feet' }
1478
+ | { kind: 'if_score', a: string, op: CmpOp, b: string }
1479
+ | { kind: 'unless_score', a: string, op: CmpOp, b: string }
1480
+ | { kind: 'if_matches', score: string, range: string }
1481
+ | { kind: 'unless_matches', score: string, range: string }
1482
+ ```
1483
+
1484
+ ---
1485
+
1486
+ ## LIR Instruction Set (Specification)
1487
+
1488
+ LIR is 2-address, MC-specific, typed nodes — no raw strings.
1489
+ Each LIR instruction maps 1:1 (or near) to one MC command.
1490
+
1491
+ ### Slot type
1492
+
1493
+ ```typescript
1494
+ // A scoreboard slot: fake-player name + objective
1495
+ interface Slot { player: string; obj: string }
1496
+ ```
1497
+
1498
+ ### Instructions
1499
+
1500
+ ```typescript
1501
+ type LIRInstr =
1502
+ // ── Scoreboard ───────────────────────────────────────────────────────────
1503
+ | { kind: 'score_set', dst: Slot, value: number }
1504
+ // scoreboard players set <dst.player> <dst.obj> value
1505
+
1506
+ | { kind: 'score_copy', dst: Slot, src: Slot }
1507
+ // scoreboard players operation <dst> = <src>
1508
+
1509
+ | { kind: 'score_add', dst: Slot, src: Slot } // +=
1510
+ | { kind: 'score_sub', dst: Slot, src: Slot } // -=
1511
+ | { kind: 'score_mul', dst: Slot, src: Slot } // *=
1512
+ | { kind: 'score_div', dst: Slot, src: Slot } // /=
1513
+ | { kind: 'score_mod', dst: Slot, src: Slot } // %=
1514
+ | { kind: 'score_min', dst: Slot, src: Slot } // < (min)
1515
+ | { kind: 'score_max', dst: Slot, src: Slot } // > (max)
1516
+ | { kind: 'score_swap', a: Slot, b: Slot } // ><
1517
+
1518
+ // ── Execute store ────────────────────────────────────────────────────────
1519
+ | { kind: 'store_cmd_to_score', dst: Slot, cmd: LIRInstr }
1520
+ // execute store result score <dst> run <cmd>
1521
+
1522
+ | { kind: 'store_score_to_nbt',
1523
+ ns: string, path: string, type: NBTType, scale: number,
1524
+ src: Slot }
1525
+ // execute store result storage <ns> <path> <type> <scale> run scoreboard players get <src>
1526
+
1527
+ | { kind: 'store_nbt_to_score',
1528
+ dst: Slot, ns: string, path: string, scale: number }
1529
+ // execute store result score <dst> run data get storage <ns> <path> <scale>
1530
+
1531
+ // ── NBT ──────────────────────────────────────────────────────────────────
1532
+ | { kind: 'nbt_set_literal', ns: string, path: string, value: string }
1533
+ // data modify storage <ns> <path> set value <value>
1534
+
1535
+ | { kind: 'nbt_copy', srcNs: string, srcPath: string, dstNs: string, dstPath: string }
1536
+ // data modify storage <dstNs> <dstPath> set from storage <srcNs> <srcPath>
1537
+
1538
+ // ── Control flow ─────────────────────────────────────────────────────────
1539
+ | { kind: 'call', fn: string }
1540
+ // function <fn>
1541
+
1542
+ | { kind: 'call_macro', fn: string, storage: string }
1543
+ // function <fn> with storage <storage>
1544
+
1545
+ | { kind: 'call_if_matches', fn: string, slot: Slot, range: string }
1546
+ // execute if score <slot> matches <range> run function <fn>
1547
+
1548
+ | { kind: 'call_unless_matches', fn: string, slot: Slot, range: string }
1549
+
1550
+ | { kind: 'call_if_score',
1551
+ fn: string, a: Slot, op: CmpOp, b: Slot }
1552
+ // execute if score <a> <op> <b> run function <fn>
1553
+
1554
+ | { kind: 'call_unless_score', fn: string, a: Slot, op: CmpOp, b: Slot }
1555
+
1556
+ | { kind: 'call_context', fn: string, subcommands: ExecuteSubcmd[] }
1557
+ // execute [subcommands] run function <fn>
1558
+
1559
+ | { kind: 'return_value', slot: Slot }
1560
+ // scoreboard players operation $ret <obj> = <slot> (then implicit return)
1561
+
1562
+ // ── Macro line ────────────────────────────────────────────────────────────
1563
+ | { kind: 'macro_line', template: string }
1564
+ // A line starting with $ in a macro function.
1565
+ // template uses $(param) substitutions: e.g. "$particle end_rod ^$(px) ^$(py) ^5 ..."
1566
+
1567
+ // ── Arbitrary MC command (for builtins not covered above) ─────────────────
1568
+ | { kind: 'raw', cmd: string }
1569
+ // Emitted verbatim. Use sparingly — prefer typed instructions.
1570
+ ```
1571
+
1572
+ ### LIR function structure
1573
+
1574
+ ```typescript
1575
+ interface LIRFunction {
1576
+ name: string
1577
+ instructions: LIRInstr[] // flat list (no blocks; control flow is via call_if_*)
1578
+ isMacro: boolean
1579
+ macroParams: string[] // names of $(param) substitution keys
1580
+ }
1581
+
1582
+ interface LIRModule {
1583
+ functions: LIRFunction[]
1584
+ namespace: string
1585
+ objective: string
1586
+ }
1587
+ ```
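
As an illustration of the 3-address to 2-address step, the sketch below lowers a MIR `add` into LIR and prints the resulting commands. It relies on the MIR guarantee that `dst` is a fresh temporary, so copying `a` into `dst` cannot clobber `b`. The `$const_<n>` slot naming and the `__ns` objective are assumptions made for the example.

```typescript
interface Slot { player: string; obj: string }
type Operand = { kind: 'temp'; name: string } | { kind: 'const'; value: number }
// Subset of LIRInstr needed for this example.
type LIR =
  | { kind: 'score_set'; dst: Slot; value: number }
  | { kind: 'score_copy'; dst: Slot; src: Slot }
  | { kind: 'score_add'; dst: Slot; src: Slot }

const OBJECTIVE = '__ns' // assumed objective name
const slot = (name: string): Slot => ({ player: `$${name}`, obj: OBJECTIVE })

// dst = add a, b   becomes   dst = a; dst += b
function lowerAdd(dst: string, a: Operand, b: Operand): LIR[] {
  const out: LIR[] = []
  const d = slot(dst)
  if (a.kind === 'const') out.push({ kind: 'score_set', dst: d, value: a.value })
  else out.push({ kind: 'score_copy', dst: d, src: slot(a.name) })

  let src: Slot
  if (b.kind === 'const') {
    // Constants on the right-hand side are materialized into a slot first.
    src = slot(`const_${b.value}`)
    out.push({ kind: 'score_set', dst: src, value: b.value })
  } else {
    src = slot(b.name)
  }
  out.push({ kind: 'score_add', dst: d, src })
  return out
}

function emit(i: LIR): string {
  switch (i.kind) {
    case 'score_set':
      return `scoreboard players set ${i.dst.player} ${i.dst.obj} ${i.value}`
    case 'score_copy':
      return `scoreboard players operation ${i.dst.player} ${i.dst.obj} = ${i.src.player} ${i.src.obj}`
    case 'score_add':
      return `scoreboard players operation ${i.dst.player} ${i.dst.obj} += ${i.src.player} ${i.src.obj}`
  }
}
```

Lowering `t2 = add t0, t1` this way produces a copy into `$t2` followed by `+=` from `$t1`, that is, two commands for one MIR instruction.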
1588
+
1589
+ ---
1590
+
1591
+ ## `execute` Context Blocks in the Pipeline
1592
+
1593
+ `at @s {}`, `as @e[...] {}`, `positioned {}`, etc. are kept as first-class statements in **Stage 2 (HIR)** and lowered to helper functions in **Stage 3 (MIR)**.
1594
+
1595
+ ### HIR representation
1596
+
1597
+ ```typescript
1598
+ // In HIR, execute context blocks are a first-class statement:
1599
+ interface HIRExecuteBlock {
1600
+ kind: 'execute_block'
1601
+ subcommands: ExecuteSubcmd[]
1602
+ body: HIRStmt[]
1603
+ }
1604
+ ```
1605
+
1606
+ ### Stage 2 → Stage 3 lowering
1607
+
1608
+ The execute block body is **extracted into a helper function** during HIR → MIR lowering.
1609
+ Variables captured from the enclosing scope are passed as parameters.
1610
+
1611
+ ```redscript
1612
+ // Source
1613
+ as @e[tag=foo] at @s {
1614
+ let dx: int = target_x - my_x;
1615
+ deal_damage(dx);
1616
+ }
1617
+ ```
1618
+
1619
+ HIR:
1620
+ ```
1621
+ execute_block {
1622
+ subcommands: [as @e[tag=foo], at_self],
1623
+ body: [let dx = target_x - my_x, call deal_damage(dx)]
1624
+ }
1625
+ ```
1626
+
1627
+ MIR lowering produces:
1628
+ ```
1629
+ // Capture free variables as args
1630
+ call_context '_exec_helper_0', [as @e[tag=foo], at_self]
1631
+
1632
+ // New MIR function:
1633
+ fn _exec_helper_0(target_x: int, my_x: int):
1634
+ t0 = sub target_x, my_x
1635
+ call deal_damage [t0]
1636
+ return void
1637
+ ```
1638
+
1639
+ In LIR:
1640
+ ```
1641
+ call_context 'ns:_exec_helper_0' [as @e[tag=foo], at @s]
1642
+ ```
1643
+
1644
+ Emitted mcfunction:
1645
+ ```
1646
+ execute as @e[tag=foo] at @s run function ns:_exec_helper_0
1647
+ ```
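
The final emit step for `call_context` is a straightforward rendering of the subcommand list. A sketch covering only a subset of `ExecuteSubcmd` (the score-based conditions are left out); `renderSubcmd` and `renderCallContext` are hypothetical names:

```typescript
// Subset of ExecuteSubcmd, repeated here so the example is self-contained.
type ExecSub =
  | { kind: 'as'; selector: string }
  | { kind: 'at'; selector: string }
  | { kind: 'at_self' }
  | { kind: 'positioned'; x: string; y: string; z: string }
  | { kind: 'in'; dimension: string }
  | { kind: 'anchored'; anchor: 'eyes' | 'feet' }

function renderSubcmd(s: ExecSub): string {
  switch (s.kind) {
    case 'as': return `as ${s.selector}`
    case 'at': return `at ${s.selector}`
    case 'at_self': return 'at @s'
    case 'positioned': return `positioned ${s.x} ${s.y} ${s.z}`
    case 'in': return `in ${s.dimension}`
    case 'anchored': return `anchored ${s.anchor}`
  }
}

// execute [subcommands] run function <fn>
function renderCallContext(fn: string, subcommands: ExecSub[]): string {
  const parts = ['execute'].concat(subcommands.map(renderSubcmd), ['run', 'function', fn])
  return parts.join(' ')
}
```

Rendering the example above, `renderCallContext('ns:_exec_helper_0', [{ kind: 'as', selector: '@e[tag=foo]' }, { kind: 'at_self' }])`, gives the emitted line shown in the previous block.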
1648
+
1649
+ ### Nested execute blocks
1650
+
1651
+ Nested `as ... at ... { as ... { } }` blocks become nested helper functions.
1652
+ The compiler generates unique names (`_exec_helper_0`, `_exec_helper_1`, ...) in the order
1653
+ they appear in the source.
1654
+
1655
+ ---
1656
+
1657
+ ## Migration Strategy: Implementing the New Compiler
1658
+
1659
+ ### Approach: parallel implementation on a feature branch
1660
+
1661
+ Do **not** refactor in-place. The current compiler is working and serving users.
1662
+ Implement the new pipeline in a separate branch (`refactor/pipeline-v2`) until
1663
+ it passes 920/920 tests, then merge and delete the old code.
1664
+
1665
+ ### Directory structure during migration
1666
+
1667
+ ```
1668
+ src/ (current compiler — untouched during refactor)
1669
+ src2/ (new compiler — staged implementation)
1670
+ parser/ → copy of existing src/parser/ (Stage 1, keep as-is)
1671
+ lexer/ → copy of existing src/lexer/ (Stage 1, keep as-is)
1672
+ ast/ → copy of existing src/ast/ types
1673
+ hir/ → Stage 2: HIR types + AST→HIR lowering
1674
+ mir/ → Stage 3: MIR types + HIR→MIR lowering
1675
+ optimizer/ → Stage 4: MIR passes (new, 3-address aware)
1676
+ lir/ → Stage 5+6: LIR types + MIR→LIR + LIR passes
1677
+ emit/ → Stage 7: LIR → .mcfunction
1678
+ compile.ts → top-level entry point
1679
+ cli.ts → wire up to existing CLI
1680
+ ```
1681
+
1682
+ ### Progressive test coverage: lights up as stages complete
1683
+
1684
+ The 920 e2e tests only reach 920/920 when the full pipeline (all 7 stages) is
1685
+ complete. Do not expect them to be green mid-refactor. Each stage has its own
1686
+ appropriate test criterion:
1687
+
1688
+ | Stage | What to test | 920 e2e? |
1689
+ |---|---|---|
1690
+ | Stage 1 (parser/lexer) | Existing parser unit tests | irrelevant (no codegen) |
1691
+ | Stage 2 (HIR) | HIR unit tests: check desugaring of for/ternary/+= etc. Verify HIR is well-formed. | ✗ |
1692
+ | Stage 3 (MIR) | MIR unit tests: known patterns produce expected 3-address sequences. MIR verifier passes. | ✗ |
1693
+ | Stage 4 (optimizer) | Optimizer unit tests: input MIR → expected output MIR for each pass. | ✗ |
1694
+ | Stage 5 (LIR) | LIR unit tests: MIR→LIR lowering of specific patterns. Simple programs start producing valid mcfunction. | partial |
1695
+ | Stage 6 (LIR opt) | More programs produce correct output. Count improves vs Stage 5 baseline. | growing |
1696
+ | Stage 7 (emit) | 920/920 e2e tests pass. ✅ | ✅ |
1697
+
1698
+ **Rule:** write and pass the unit tests for a stage before moving to the next.
1699
+ Adapt as needed — the design is a guide, not a contract. If a stage boundary
1700
+ turns out to be wrong, merge stages or split them. The goal is green tests and
1701
+ maintainable code, not architectural purity.
1702
+
1703
+ ### Test harness
1704
+
1705
+ The existing 920 tests use `MCRuntime` and test final behavior, not IR internals.
1705
+ They are the primary regression suite: they must keep passing against the untouched `src/` compiler at every stage gate, and against `src2/` once Stage 7 is complete.
1707
+
1708
+ Add **IR-level unit tests** for stages 2–6 independently:
1709
+ - HIR tests: check AST→HIR desugaring (for→while, ternary expansion, etc.)
1710
+ - MIR tests: check instruction sequences for known patterns
1711
+ - Optimizer tests: input MIR → expected output MIR for each pass
1712
+ - LIR tests: check MIR→LIR lowering of specific instruction patterns
1713
+
1714
+ These new tests live in `src2/__tests__/`.
1715
+
1716
+ ### What to tell Claude
1717
+
1718
+ When handing this to Claude for implementation:
1719
+ 1. Provide this document as context
1720
+ 2. Implement one stage at a time, in order (1 → 2 → 3 → 4 → 5 → 6 → 7)
1721
+ 3. Write the verifier for each stage before the next stage begins
1722
+ 4. Keep `npm test` (920 tests) passing at every stage gate
1723
+ 5. Commit after each stage gate passes
1724
+ 6. Do not start Stage N+1 until Stage N gate is green
1725
+
1726
+
1727
+ ---
1728
+
1729
+ ## Standard Library Redesign
1730
+
1731
+ The current stdlib was built incrementally and has several design problems.
1732
+ The refactor is an opportunity to fix them properly.
1733
+
1734
+ ### Current problems
1735
+
1736
+ **1. Naming inconsistency**
1737
+
1738
+ ```
1739
+ sin_fixed cos_fixed sqrt_fixed ← _fixed suffix
1740
+ mulfix divfix ← no underscore, no _fixed
1741
+ lerp clamp ← takes int but semantics are fixed-point
1742
+ ```
1743
+
1744
+ No unified convention. A user reading code cannot tell which functions operate
1745
+ on fixed-point values without reading the docs.
1746
+
1747
+ **2. No type enforcement for fixed-point values**
1748
+
1749
+ Every fixed-point function takes and returns `int`. There is nothing stopping
1750
+ a user from passing a raw integer (e.g. pixel coordinate `px = 10`) to
1751
+ `mulfix` where a fixed-point value (e.g. `sin_fixed(45) = 707`) was expected.
1752
+ The compiler cannot catch this. Silent wrong-answer bugs.
1753
+
1754
+ **3. `mulfix` overflow**
1755
+
1756
+ ```redscript
1757
+ fn mulfix(a: int, b: int) -> int { return a * b / 1000; }
1758
+ ```
1759
+
1760
+ `a * b` is a scoreboard `*=` operation, which wraps at INT32 (~2.1 billion).
1761
+ If `a = 50000` (= 50.0 in fixed-point) and `b = 50000`, then
1762
+ `a * b = 2,500,000,000` which overflows. Result is silently wrong.
1763
+ Safe range: both inputs ≤ 46340 (√(2³¹ − 1) ≈ 46340.95; 46341² already overflows).
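
The wrap-around can be reproduced outside the game with `Math.imul`, which performs the same 32-bit wrapping multiplication as a Java `int` multiply. The helper names below are illustrative only:

```typescript
const INT32_MAX = 2147483647
const INT32_MIN = -2147483648

// 32-bit wrapping multiply, matching what a scoreboard `*=` does to the stored value.
function wrappingMul(a: number, b: number): number {
  return Math.imul(a, b)
}

// True when the mathematically exact product does not fit in INT32.
function mulOverflows(a: number, b: number): boolean {
  const exact = a * b
  return exact > INT32_MAX || exact < INT32_MIN
}
```

`wrappingMul(50000, 50000)` returns `-1794967296` instead of 2,500,000,000, and `mulOverflows` flips from `false` to `true` between `46340 × 46340` and `46341 × 46341`.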
1764
+
1765
+ **4. Vector functions: flat parameters, split return values**
1766
+
1767
+ ```redscript
1768
+ // Current: flat args, two separate functions for one 2D result
1769
+ fn normalize2d_x(x: int, y: int) -> int { ... }
1770
+ fn normalize2d_y(x: int, y: int) -> int { ... }
1771
+ ```
1772
+
1773
+ No struct-based API. Every 2D/3D operation takes 2–6 separate int parameters.
1774
+ `normalize2d_x` / `normalize2d_y` is especially bad — one logical operation
1775
+ split into two functions that each recompute the length independently.
1776
+
1777
+ Internal overflow: `x * 1000000` before dividing by length — overflows if
1778
+ either component > ~2147 (documented but easy to miss).
1779
+
1780
+ **5. `lerp(a, b, t)` mixes int and fixed-point semantics**
1781
+
1782
+ `lerp(0, 100, 500)` = 50. The `t` parameter is fixed-point (500 = 0.5),
1783
+ but `a` and `b` are plain integers. This is logically inconsistent.
1784
+ If everything were `float`, it would just be `lerp(0.0, 100.0, 0.5) = 50.0`.
1785
+
1786
+ ---
1787
+
1788
+ ### Root cause: no `float` type in the old compiler
1789
+
1790
+ All of the above stem from the same source: without a first-class `float` type,
1791
+ there is no way to distinguish `500` (the integer) from `500` (= 0.5 in ×1000
1792
+ fixed-point). The programmer must track the scale mentally. The API carries
1793
+ that burden in naming conventions (`_fixed` suffix) instead of in types.
1794
+
1795
+ ---
1796
+
1797
+ ### New stdlib design: `float` type does the heavy lifting
1798
+
1799
+ With the new type system, `float` = INT32 ×1000 fixed-point, enforced by the
1800
+ compiler. Arithmetic rules:
1801
+
1802
+ | Operation | Left | Right | Result | Codegen |
1803
+ |---|---|---|---|---|
1804
+ | `a + b` | float | float | float | `+=` (no correction) |
1805
+ | `a - b` | float | float | float | `-=` |
1806
+ | `a * b` | float | float | float | `*=` then `/= 1000` (mulfix) |
1807
+ | `a / b` | float | float | float | `*= 1000` then `/=` (divfix) |
1808
+ | `a * n` | float | int | float | `*=` (scale by integer, no correction) |
1809
+ | `a / n` | float | int | float | `/=` |
1810
+ | `int as float` | int | n/a | float | `*= 1000` |
1810
+ | `float as int` | float | n/a | int | `/= 1000` |
1812
+
1813
+ The user never writes `mulfix` or `divfix` — those become `*` and `/` on `float`.
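
The table can be read as the following reference semantics on raw ×1000 values. This is a sketch, not the compiler's codegen: it uses `Math.trunc` for division (following the MIR `div` comment, "truncated toward zero") and deliberately ignores INT32 wrap-around. `Fixed`, `fmul`, `fdiv`, and the other names are hypothetical.

```typescript
const SCALE = 1000
type Fixed = number // raw ×1000 representation of a `float`

const fromInt = (n: number): Fixed => n * SCALE                        // int as float
const toInt = (f: Fixed): number => Math.trunc(f / SCALE)              // float as int
const fadd = (a: Fixed, b: Fixed): Fixed => a + b                      // no correction
const fsub = (a: Fixed, b: Fixed): Fixed => a - b
const fmul = (a: Fixed, b: Fixed): Fixed => Math.trunc((a * b) / SCALE) // *= then /= 1000
const fdiv = (a: Fixed, b: Fixed): Fixed => Math.trunc((a * SCALE) / b) // *= 1000 then /=
const scaleByInt = (a: Fixed, n: number): Fixed => a * n               // float * int

// lerp written purely in terms of float arithmetic: a + (b - a) * t
const lerp = (a: Fixed, b: Fixed, t: Fixed): Fixed => fadd(a, fmul(fsub(b, a), t))
```

For example, `fmul(1500, 2000)` is `3000` (1.5 × 2.0 = 3.0), `fdiv(3000, 2000)` is `1500`, and `lerp(fromInt(0), fromInt(100), 500)` is `50000` (50.0).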
1814
+
1815
+ ---
1816
+
1817
+ ### New stdlib API
1818
+
1819
+ #### math module
1820
+
1821
+ ```redscript
1822
+ // Integer math (unchanged semantics, cleaned up)
1823
+ fn abs(x: int): int
1824
+ fn sign(x: int): int // -1, 0, 1
1825
+ fn min(a: int, b: int): int
1826
+ fn max(a: int, b: int): int
1827
+ fn clamp(x: int, lo: int, hi: int): int
1828
+ fn pow_int(base: int, exp: int): int
1829
+ fn gcd(a: int, b: int): int
1830
+ fn lcm(a: int, b: int): int
1831
+ fn isqrt(n: int): int // integer square root
1832
+
1833
+ // Fixed-point math (now properly typed)
1834
+ fn sin(deg: int): float // was sin_fixed; returns float (×1000)
1835
+ fn cos(deg: int): float // was cos_fixed
1836
+ fn sqrt(x: float): float // was sqrt_fixed; input & output float
1837
+ fn abs_f(x: float): float // float abs
1838
+ fn clamp_f(x: float, lo: float, hi: float): float
1839
+ fn lerp(a: float, b: float, t: float): float // t in [0.0, 1.0] = [0, 1000]
1840
+ fn map(x: float, in_lo: float, in_hi: float, out_lo: float, out_hi: float): float
1841
+ fn smoothstep(t: float): float // t in [0.0, 1.0], was smoothstep(lo, hi, x)
1842
+ fn smootherstep(t: float): float
1843
+ fn atan2(y: float, x: float): int // returns degrees (int, 0-359)
1844
+ ```
1845
+
1846
+ Removed from public API: `mulfix`, `divfix` — these become `*` and `/` on `float`.
1847
+
1848
+ **`smoothstep` interface change:** `smoothstep(lo, hi, x)` → `smoothstep(t)` where
1849
+ the caller normalizes `t` first using `map`. This is simpler and more composable.
1850
+
1851
+ #### vec module (struct-based)
1852
+
1853
+ ```redscript
1854
+ struct Vec2 { x: float; y: float; }
1855
+ struct Vec3 { x: float; y: float; z: float; }
1856
+
1857
+ impl Vec2 {
1858
+ fn add(self, other: Vec2): Vec2
1859
+ fn sub(self, other: Vec2): Vec2
1860
+ fn scale(self, s: float): Vec2 // scalar multiply
1861
+ fn dot(self, other: Vec2): float
1862
+ fn length_sq(self): float // x*x + y*y (no sqrt, no overflow)
1863
+ fn length(self): float // sqrt of length_sq
1864
+ fn dist(self, other: Vec2): float
1865
+ fn manhattan(self, other: Vec2): float
1866
+ fn chebyshev(self, other: Vec2): float
1867
+ fn lerp(self, other: Vec2, t: float): Vec2
1868
+ }
1869
+ ```
1870
+
1871
+ **Multi-value return problem:** `normalize()` must return a `Vec2` (two values),
1872
+ but MC functions return one INT32. Options:
1873
+
1874
+ 1. **Two functions:** `fn normalize_x(self): float` + `fn normalize_y(self): float`
1875
+ (current approach — ugly but simple)
1876
+
1877
+ 2. **Write to output parameter via well-known storage:**
1878
+ ```redscript
1879
+ fn normalize(self): Vec2
1880
+ // compiler generates: compute into $ret_x / $ret_y slots
1881
+ // call site: reads those slots back into the dest Vec2
1882
+ ```
1883
+ Requires compiler support for multi-slot struct return values.
1884
+
1885
+ 3. **Explicit output parameter:**
1886
+ ```redscript
1887
+ fn normalize(self, out_x: int, out_y: int) // caller passes NBT paths
1888
+ ```
1889
+ Awkward in user code.
1890
+
1891
+ **Decision: implement option 2 for the new compiler.** Struct return values
1892
+ use per-field "return slots" (`$ret_<field> __ns`). The caller reads them after
1893
+ the call. The compiler generates this transparently.
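
A small simulation of the return-slot convention, with the scoreboard modeled as a `Map`. Everything here is a stand-in: `normalizeCallee` uses JS `Math.sqrt` where the real stdlib would use its own `sqrt`, and the zero-length guard is an assumption rather than specified behavior.

```typescript
const RET_OBJ = '__ns' // objective name, as in the slot names above
const scores = new Map<string, number>() // stand-in for the scoreboard

const setScore = (player: string, value: number): void => {
  scores.set(`${player} ${RET_OBJ}`, value)
}
const getScore = (player: string): number => scores.get(`${player} ${RET_OBJ}`) ?? 0

// Callee: writes each field of the resulting Vec2 (raw ×1000 values) to its return slot.
function normalizeCallee(x: number, y: number): void {
  const len = Math.round(Math.sqrt(x * x + y * y)) // stand-in for the stdlib sqrt
  setScore('$ret_x', len === 0 ? 0 : Math.trunc((x * 1000) / len))
  setScore('$ret_y', len === 0 ? 0 : Math.trunc((y * 1000) / len))
}

// Call site: invoke the callee, then copy every return slot into the destination struct.
function callNormalize(x: number, y: number): { x: number; y: number } {
  normalizeCallee(x, y)
  return { x: getScore('$ret_x'), y: getScore('$ret_y') }
}
```

`callNormalize(3000, 4000)` yields `{ x: 600, y: 800 }`, that is (0.6, 0.8). Since the return slots are shared, the call site presumably has to copy them out before the next struct-returning call overwrites them.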
1894
+
1895
+ #### timer module (multi-instance)
1896
+
1897
+ ```redscript
1898
+ struct Timer { _id: int; }
1899
+
1900
+ impl Timer {
1901
+ @static fn new(): Timer // allocates a slot, returns id
1902
+ fn start(self, ticks: int) // begin countdown
1903
+ fn tick(self) // call in @tick — decrements if active
1904
+ fn is_done(self): bool
1905
+ fn cancel(self)
1906
+ }
1907
+ ```
1908
+
1909
+ Uses `_id`-scoped fake player names: `timer_0_ticks`, `timer_1_ticks`, etc.
1910
+ Breaks the single-instance limitation.
1911
+
1912
+ #### bigint module (unchanged API, internal cleanup)
1913
+
1914
+ BigInt API is already reasonable. Internal cleanup:
1915
+ - Use struct: `struct BigInt { _id: int }`
1916
+ - Slot-based NBT storage (already uses `rs:bigint`, extend to per-instance paths)
1917
+ - Add: `shift_left`, `mod_pow`, string conversion
1918
+
1919
+ ---
1920
+
1921
+ ### Overflow: permanent constraints
1922
+
1923
+ These cannot be eliminated — they are fundamental to INT32 scoreboard arithmetic.
1924
+
1925
+ | Operation | Safe range | Notes |
1926
+ |---|---|---|
1927
+ | `a * b` (int × int) | both < 46341 | INT32 overflow at 46341² |
1928
+ | `float * float` | both < 46.341 (= 46341 fixed-point) | same limit |
1929
+ | `normalize2d` | input components < ~2147 | intermediate `x * 1000000` |
1930
+ | `sin(deg) * r` (float × float) | r ≤ 2,147,483 raw (≈ 2147.483) | sin output ≤ 1000, so the intermediate product is at most 1000 × r |
1931
+
1932
+ The stdlib should document these clearly and provide overflow-safe variants
1933
+ (e.g. `length_sq` instead of `length` when the square is sufficient).
1934
+
1935
+
1936
+ ---
1937
+
1938
+ ## Testing During the Refactor (Addendum)
1939
+
1940
+ ### Write tests alongside implementation, not after
1941
+
1942
+ Each stage should produce its own tests as it is built.
1943
+ Do not batch all testing to the end.
1944
+
1945
+ ```
1946
+ Implement Stage N → write unit tests for Stage N → tests pass → commit → Stage N+1
1947
+ ```
1948
+
1949
+ Stage-local tests are small and focused:
1950
+ - HIR tests: "this source snippet produces this HIR node"
1951
+ - MIR tests: "this HIR function produces this 3-address sequence"
1952
+ - Optimizer tests: "this MIR input becomes this MIR output after pass X"
1953
+ - LIR tests: "this MIR instruction lowers to this LIR instruction"
1954
+
1955
+ The 920 e2e tests are a final integration check, not a substitute for unit tests.
1956
+
1957
+ ### Syntax changes will break many existing tests — that's fine
1958
+
1959
+ The new compiler introduces syntax changes that are not backward-compatible
1960
+ with the current `.mcrs` test fixtures. This is expected and acceptable.
1961
+
1962
+ Known breaking changes:
1963
+
1964
+ | Old syntax | New syntax |
1965
+ |---|---|
1966
+ | `fn foo(x: int) -> int` | `fn foo(x: int): int` |
1967
+ | `@keep fn foo()` | `export fn foo()` |
1968
+ | `struct Foo` + `impl Foo` (separate) | `struct Foo { ... impl ... }` or keep separate — TBD |
1969
+ | `use "path"` (current) | may stay the same |
1970
+ | `float` type was mostly `int` in practice | `float` is now distinct |
1971
+
1972
+ **Strategy for existing tests:**
1973
+ - Tests whose source syntax changes must be updated before they can run against the new compiler
1974
+ - Do not force the new compiler to accept old syntax for test compatibility
1975
+ - Update test fixtures as part of Stage 7 (when the full pipeline is wired up and tests are being migrated)
1976
+ - The old compiler in `src/` still exists until the refactor is complete — old tests can still run against it at any time
1977
+
1978
+ ### Test structure for the new compiler (`src2/__tests__/`)
1979
+
1980
+ ```
1981
+ src2/__tests__/
1982
+ hir/
1983
+ desugar.test.ts for → while transforms, ternary expansion, etc.
1984
+ execute-blocks.test.ts execute context block lowering to helper fns
1985
+ mir/
1986
+ arithmetic.test.ts int/float arithmetic sequences
1987
+ control-flow.test.ts if/while/loop → CFG branch structure
1988
+ calls.test.ts regular + macro function calls
1989
+ verify.test.ts MIR verifier catches malformed IR
1990
+ optimizer/
1991
+ constant-fold.test.ts
1992
+ copy-prop.test.ts
1993
+ dce.test.ts
1994
+ block-merge.test.ts
1995
+ lir/
1996
+ scoreboard.test.ts score_copy/add/etc lowering
1997
+ execute.test.ts call_if_matches, call_context lowering
1998
+ macro.test.ts macro_line generation
1999
+ e2e/
2000
+ *.test.ts updated versions of the 920 tests (staged migration)
2001
+ ```
2002
+
2003
+
2004
+ ---
2005
+
2006
+ ## MCRuntime Profiler
2007
+
2008
+ ### Motivation
2009
+
2010
+ `MCRuntime` already simulates full MC execution (scoreboard, NBT, function dispatch).
2011
+ Adding a profiling mode costs almost nothing and unlocks:
2012
+
2013
+ 1. **Stdlib tuning** — find the optimal lookup table resolution for trig functions
2014
+ 2. **Error analysis** — measure fixed-point accuracy vs JS float ground truth
2015
+ 3. **Coroutine BATCH calibration** — auto-estimate safe batch size from per-iteration command count
2016
+ 4. **Regression detection** — alert when a refactor increases command counts
2017
+
2018
+ ### Implementation
2019
+
2020
+ ```typescript
2021
+ interface ProfilingOptions {
2022
+ enabled: boolean
2023
+ trackPerFunction: boolean // per-function command counts
2024
+ groundTruth?: boolean // compare int results to JS float (for math functions)
2025
+ }
2026
+
2027
+ class MCRuntime {
2028
+ // existing: scoreboard, storage, function dispatch...
2029
+
2030
+ // profiling (opt-in, zero cost when disabled)
2031
+ private profiling?: ProfilingOptions
2032
+ commandCount = 0
2033
+ functionStats = new Map<string, { calls: number, totalCmds: number, maxDepth: number }>()
2034
+
2035
+ constructor(options?: { profiling?: ProfilingOptions }) { ... }
2036
+ }
2037
+ ```
2038
+
2039
+ Usage:
2040
+ ```typescript
2041
+ const rt = new MCRuntime({ profiling: { enabled: true, trackPerFunction: true } })
2042
+ rt.call('rsdemo:_draw', { px: 500, py: 707 })
2043
+ console.log(rt.functionStats)
2044
+ // Map {
2045
+ // 'rsdemo:_draw' → { calls: 1, totalCmds: 3, maxDepth: 1 }
2046
+ // }
2047
+ console.log(rt.commandCount) // 3
2048
+ ```
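How the counters could be maintained is not spelled out above. The following is a minimal sketch under stated assumptions: `CommandProfiler`, `enter`, `exit`, and `countCommand` are hypothetical names, `totalCmds` is assumed to be inclusive of callees, and `maxDepth` is assumed to mean the deepest call-stack position at which the function was entered.

```typescript
// Hypothetical sketch; none of these names come from the actual MCRuntime source.
interface FunctionStats { calls: number; totalCmds: number; maxDepth: number }

class CommandProfiler {
  commandCount = 0
  readonly functionStats = new Map<string, FunctionStats>()
  private stack: string[] = []

  constructor(private readonly enabled: boolean) {}

  // Called when the runtime dispatches into a function.
  enter(fn: string): void {
    if (!this.enabled) return
    this.stack.push(fn)
    const s = this.functionStats.get(fn) ?? { calls: 0, totalCmds: 0, maxDepth: 0 }
    s.calls += 1
    s.maxDepth = Math.max(s.maxDepth, this.stack.length)
    this.functionStats.set(fn, s)
  }

  // Called when the function returns.
  exit(): void {
    if (!this.enabled) return
    this.stack.pop()
  }

  // Called once per executed command; charges every distinct function on the stack.
  countCommand(): void {
    if (!this.enabled) return
    this.commandCount += 1
    new Set(this.stack).forEach(fn => {
      this.functionStats.get(fn)!.totalCmds += 1
    })
  }
}

// Three commands executed inside a single top-level call.
const profiler = new CommandProfiler(true)
profiler.enter('rsdemo:_draw')
profiler.countCommand()
profiler.countCommand()
profiler.countCommand()
profiler.exit()
```

When `enabled` is false every hook returns immediately, which is one way to keep the disabled path cheap.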
2049
+
2050
+ ### Error analysis for math functions
2051
+
2052
+ Compare MCRuntime output (integer fixed-point) against JS `Math` (IEEE 754 float):
2053
+
2054
+ ```typescript
2055
+ function analyzeError(fn: string, inputRange: number[], expected: (x: number) => number) {
2056
+ const rt = new MCRuntime()
2057
+ const errors: number[] = []
2058
+
2059
+ for (const x of inputRange) {
2060
+ const computed = rt.call(fn, x) // integer ×1000 result
2061
+ const truth = expected(x) * 1000 // JS float × 1000
2062
+ errors.push(Math.abs(computed - truth))
2063
+ }
2064
+
2065
+ return {
2066
+ maxError: Math.max(...errors),
2067
+ avgError: errors.reduce((a, b) => a + b, 0) / errors.length, // initial value avoids a TypeError on an empty range
2068
+ worstCase: inputRange[errors.indexOf(Math.max(...errors))],
2069
+ }
2070
+ }
2071
+
2072
+ // sin lookup table (91 entries, 1° resolution):
2073
+ analyzeError('math:sin', range(0, 360), x => Math.sin(x * Math.PI / 180))
2074
+ // → { maxError: 0.89, avgError: 0.31, worstCase: 1 }
2075
+
2076
+ // If we doubled to 181 entries (0.5° resolution), how much better?
2077
+ // → { maxError: 0.22, avgError: 0.08 }, but +90 storage entries and ~2 more commands per call
2078
+ // Trade-off is now measurable, not guessed.
2079
+ ```
2080
+
2081
+ This makes stdlib decisions data-driven:
2082
+
2083
+ | Table size | Max error | Avg error | Commands/call |
2084
+ |---|---|---|---|
2085
+ | 91 entries (1°) | 0.89 | 0.31 | ~12 |
2086
+ | 181 entries (0.5°) | 0.22 | 0.08 | ~14 |
2087
+ | 361 entries (0.25°) | 0.055 | 0.02 | ~16 |
2088
+
2089
+ Pick the row that fits your application's accuracy budget.
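Row selection can itself be automated. The sketch below is illustrative only: `TableOption` and `pickTable` are hypothetical names, the figures are copied from the table above, and "cheapest" is assumed to mean fewest commands per call, then fewest entries.

```typescript
// Hypothetical helper; the rows mirror the table above.
interface TableOption {
  entries: number
  maxError: number
  avgError: number
  cmdsPerCall: number
}

const SIN_TABLE_OPTIONS: TableOption[] = [
  { entries: 91, maxError: 0.89, avgError: 0.31, cmdsPerCall: 12 },
  { entries: 181, maxError: 0.22, avgError: 0.08, cmdsPerCall: 14 },
  { entries: 361, maxError: 0.055, avgError: 0.02, cmdsPerCall: 16 },
]

// Returns the cheapest option whose max error fits the budget, or undefined if none does.
function pickTable(options: TableOption[], maxErrorBudget: number): TableOption | undefined {
  return options
    .filter(o => o.maxError <= maxErrorBudget) // filter() copies, so sort() leaves the input untouched
    .sort((a, b) => a.cmdsPerCall - b.cmdsPerCall || a.entries - b.entries)[0]
}

const chosen = pickTable(SIN_TABLE_OPTIONS, 0.25)
```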
2090
+
2091
+ ### Coroutine BATCH calibration
2092
+
2093
+ The profiler directly answers "how big can my BATCH be without exceeding 65536 commands?" (65536 is Minecraft's default `maxCommandChainLength`):
2094
+
2095
+ ```typescript
2096
+ const rt = new MCRuntime({ profiling: { enabled: true, trackPerFunction: false } })
2097
+
2098
+ // Profile one iteration of the loop body
2099
+ rt.call('mymod:_loop_body_once', { i: 0 })
2100
+ const cmdsPerIter = rt.commandCount
2101
+
2102
+ // Safe BATCH = floor(budget / cmdsPerIter)
2103
+ const SAFE_BATCH = Math.floor(65536 / cmdsPerIter)
2104
+ console.log(`Recommended @coroutine(batch=${SAFE_BATCH})`)
2105
+ ```
2106
+
2107
+ The compiler can run this automatically during compilation when `@coroutine`
2108
+ is used without an explicit `batch=N` — profile the loop body in MCRuntime,
2109
+ compute the safe batch, insert it.
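If this is automated, the division needs guarding: a loop body that measures zero commands would otherwise produce `Infinity`. A hedged sketch of the calculation follows, with `safeBatch`, `overheadCmds`, and `DEFAULT_COMMAND_BUDGET` as hypothetical names, and a fixed per-tick overhead (scheduler, state save/restore) assumed to be known to the caller.

```typescript
// Hypothetical helper; the budget defaults to the 65536 figure used above.
const DEFAULT_COMMAND_BUDGET = 65536

function safeBatch(
  cmdsPerIter: number,
  overheadCmds = 0,
  budget = DEFAULT_COMMAND_BUDGET,
): number {
  // Reject zero, negative, fractional, and NaN measurements.
  if (cmdsPerIter <= 0 || Math.floor(cmdsPerIter) !== cmdsPerIter) {
    throw new RangeError(`cmdsPerIter must be a positive integer, got ${cmdsPerIter}`)
  }
  const available = budget - overheadCmds
  // Never return less than 1: a coroutine must make progress every tick,
  // even if a single iteration already exceeds the budget.
  return Math.max(1, Math.floor(available / cmdsPerIter))
}

const batchFor12 = safeBatch(12)          // 65536 / 12, rounded down
const batchWithOverhead = safeBatch(100, 36)
```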
2110
+
2111
+ ### Placement
2112
+
2113
+ Profiling mode lives in `src2/runtime/index.ts` (or in `src/runtime/` until the refactor is complete).
2114
+ It is off by default and adds zero overhead to the 920 e2e tests.
2115
+
2116
+ New test file: `src2/__tests__/profiler/stdlib-accuracy.test.ts`
2117
+ Runs the error analysis for all math functions and asserts max error < threshold.
2118
+ This becomes a regression test: if a stdlib change degrades accuracy, the test fails.
2119
+
2120
+
2121
+ ---
2122
+
2123
+ ## Compile-Time Macros (`@comptime`)
2124
+
2125
+ ### Motivation
2126
+
2127
+ Lookup tables (sin, cos, atan2, noise permutation, easing curves) are currently
2128
+ generated by external Python scripts and pasted as literals into `.mcrs` files.
2129
+ Changing table resolution requires re-running the script. `@comptime` eliminates
2130
+ this friction.
2131
+
2132
+ ### Semantics
2133
+
2134
+ A function annotated `@comptime` runs **in the compiler** (TypeScript/JS runtime)
2135
+ rather than in Minecraft. It has access to double-precision floating-point math but no MC
2136
+ builtins. Its return value is embedded as a compile-time literal.
2137
+
2138
+ ```redscript
2139
+ // In stdlib/math.mcrs:
2140
+ @comptime fn _sin_table_data(resolution: int): int[] {
2141
+ // Runs in compiler; uses JS Math.sin (double precision). `resolution` entries span 0°..90° inclusive.
2142
+ let table: int[] = [];
2143
+ for (let i = 0; i < resolution; i++) {
2144
+ let deg = i as float * 90.0 / (resolution - 1) as float;
2145
+ table.push(floor(Math.sin(deg * PI / 180.0) * 1000.0));
2146
+ }
2147
+ return table;
2148
+ }
2149
+
2150
+ // User-configurable resolution (compile-time constant):
2151
+ const SIN_RESOLUTION: comptime int = 91; // override in your datapack
2152
+
2153
+ @load fn _math_init() {
2154
+ // Compiler replaces this call with the literal array at compile time
2155
+ storage_set_array("math:tables", "sin", _sin_table_data(SIN_RESOLUTION));
2156
+ }
2157
+ ```
2158
+
2159
+ The emitted `_math_init.mcfunction` contains a single command with the literal
2160
+ array — zero runtime overhead.
2161
+
2162
+ ### Restrictions on `@comptime` functions
2163
+
2164
+ - No MC builtins (`particle`, `kill`, `scoreboard`, etc.)
2165
+ - No `@tick` / `@load` / `@keep` / `export`
2166
+ - No side effects (pure function only)
2167
+ - Arguments must be compile-time constants (literals or `comptime` variables)
2168
+ - Return types: `int`, `float`, `bool`, `string`, or arrays of these
2169
+
2170
+ ### Implementation
2171
+
2172
+ The compiler evaluates `@comptime` calls using a small tree-walking interpreter
2173
+ (~300–400 lines) that handles: arithmetic, boolean logic, array literals/push,
2174
+ for/while loops, if/else, and a whitelist of math builtins (`Math.sin`,
2175
+ `Math.cos`, `Math.sqrt`, `Math.floor`, `Math.ceil`, `Math.round`, `PI`, `E`).
2176
+
2177
+ This runs during Stage 2 (HIR lowering) — comptime calls are evaluated and
2178
+ replaced with their results before the HIR is constructed.
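To make the interpreter idea concrete, here is a deliberately tiny sketch covering only numeric expressions, identifiers, and whitelisted calls. The AST shape (`Expr`) and the names `evalComptime`, `COMPTIME_CONSTS`, and `COMPTIME_BUILTINS` are assumptions for illustration; the real evaluator would also need statements, loops, and arrays.

```typescript
// Hypothetical AST; the compiler's real node types may differ.
type BinOp = '+' | '-' | '*' | '/'
type Expr =
  | { kind: 'num'; value: number }
  | { kind: 'ident'; name: string }
  | { kind: 'binary'; op: BinOp; left: Expr; right: Expr }
  | { kind: 'call'; callee: string; args: Expr[] }

const BINOPS: Record<BinOp, (l: number, r: number) => number> = {
  '+': (l, r) => l + r,
  '-': (l, r) => l - r,
  '*': (l, r) => l * r,
  '/': (l, r) => l / r,
}

const COMPTIME_CONSTS: Record<string, number> = { PI: Math.PI, E: Math.E }

// Whitelist from the text above; anything else is rejected at compile time.
const COMPTIME_BUILTINS: Record<string, (...args: number[]) => number> = {
  'Math.sin': Math.sin, 'Math.cos': Math.cos, 'Math.sqrt': Math.sqrt,
  'Math.floor': Math.floor, 'Math.ceil': Math.ceil, 'Math.round': Math.round,
}

const has = (o: object, k: string) => Object.prototype.hasOwnProperty.call(o, k)

function evalComptime(e: Expr, env: Record<string, number> = {}): number {
  switch (e.kind) {
    case 'num':
      return e.value
    case 'ident':
      if (has(env, e.name)) return env[e.name]
      if (has(COMPTIME_CONSTS, e.name)) return COMPTIME_CONSTS[e.name]
      throw new Error(`'${e.name}' is not a compile-time constant`)
    case 'binary':
      return BINOPS[e.op](evalComptime(e.left, env), evalComptime(e.right, env))
    case 'call':
      if (!has(COMPTIME_BUILTINS, e.callee)) {
        throw new Error(`'${e.callee}' is not allowed in @comptime code`)
      }
      return COMPTIME_BUILTINS[e.callee](...e.args.map(a => evalComptime(a, env)))
  }
}

// Math.round(Math.sin(deg * PI / 180) * 1000) with deg bound to 30
const num = (value: number): Expr => ({ kind: 'num', value })
const bin = (op: BinOp, left: Expr, right: Expr): Expr => ({ kind: 'binary', op, left, right })
const radians = bin('/', bin('*', { kind: 'ident', name: 'deg' }, { kind: 'ident', name: 'PI' }), num(180))
const sinCall: Expr = { kind: 'call', callee: 'Math.sin', args: [radians] }
const scaled: Expr = { kind: 'call', callee: 'Math.round', args: [bin('*', sinCall, num(1000))] }
const sin30 = evalComptime(scaled, { deg: 30 })
```

A whitelist lookup with `hasOwnProperty` (rather than `in`) keeps inherited names such as `toString` from being treated as constants or builtins.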
2179
+
2180
+ ### User-facing benefits
2181
+
2182
+ ```redscript
2183
+ // Tune sin table resolution for your use case:
2184
+ const SIN_RESOLUTION: comptime int = 181; // 0.5° resolution
2185
+
2186
+ // Generate a custom easing curve (ease-out cubic, sampled at 64 points):
2187
+ @comptime fn _ease_out_cubic(n: int): int[] {
2188
+ let table: int[] = [];
2189
+ for (let i = 0; i < n; i++) {
2190
+ let t = i as float / n as float;
2191
+ let v = 1.0 - (1.0 - t) * (1.0 - t) * (1.0 - t);
2192
+ table.push(floor(v * 1000.0));
2193
+ }
2194
+ return table;
2195
+ }
2196
+ const EASE_TABLE: int[] = _ease_out_cubic(64);
2197
+ ```
2198
+
2199
+ ---
2200
+
2201
+ ## Safe Math: Default On
2202
+
2203
+ ### The `safe-math` mode
2204
+
2205
+ All `float * float` and `float / float` operations default to overflow-safe
2206
+ variants. This is the **default** in the new compiler; there is no flag to
2207
+ disable it (unlike the old `--no-dce`).
2208
+
2209
+ **Safe multiply** (`a * b` where both are `float`):
2210
+ ```
2211
+ # Fast (current; the intermediate a*b overflows a 32-bit score once |a*b| > 2,147,483,647, e.g. when both |a| and |b| exceed 46,340 in fixed-point):
2212
+ scoreboard players operation $dst __ns = $a __ns
2213
+ scoreboard players operation $dst __ns *= $b __ns
2214
+ scoreboard players operation $dst __ns /= 1000
2215
+
2216
+ # Safe default (costs ~4 extra commands, prevents overflow up to ±1,000,000):
2217
+ # Divide both by 32 before multiplying, then compensate
2218
+ scoreboard players operation $t __ns = $a __ns
2219
+ scoreboard players operation $t __ns /= 32
2220
+ scoreboard players operation $u __ns = $b __ns
2221
+ scoreboard players operation $u __ns /= 32
2222
+ scoreboard players operation $dst __ns = $t __ns
2223
+ scoreboard players operation $dst __ns *= $u __ns
2224
+ scoreboard players operation $dst __ns /= 125
2225
+ scoreboard players operation $dst __ns *= 128 # a*b/1000 = (a/32)*(b/32) * 1024/1000, and 1024/1000 = 128/125; dividing first keeps the intermediate in range
2226
+ ```
2227
+
2228
+ This extends the safe multiplication range from ±46,340 to approximately ±1,000,000
2229
+ in fixed-point (= ±1,000 real units), sufficient for all typical MC use cases. The trade-off is precision: each `/= 32` discards the low five bits of its operand.
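The range claim can be checked off-line by simulating 32-bit scoreboard arithmetic. This sketch rests on several assumptions: `*=` wraps at 32 bits, `/=` floors toward negative infinity, values use the ×1000 fixed-point scale, and the compensation is applied as `/= 125` followed by `*= 128` (since a·b/1000 = (a/32)·(b/32)·1024/1000 and 1024/1000 = 128/125). The names `fastMul` and `safeMul` are hypothetical.

```typescript
// Simulated scoreboard operations (assumed semantics, see lead-in).
const scoreMul = (a: number, b: number): number => Math.imul(a, b) // wraps like a 32-bit int
const scoreDiv = (a: number, b: number): number => Math.floor(a / b) // floor division

// Fast path: multiply, then rescale. The intermediate a*b can wrap.
function fastMul(a: number, b: number): number {
  return scoreDiv(scoreMul(a, b), 1000)
}

// Safe path: pre-scale both operands by 32, multiply, then apply the 128/125 correction.
function safeMul(a: number, b: number): number {
  const t = scoreDiv(a, 32)
  const u = scoreDiv(b, 32)
  const product = scoreMul(t, u)            // roughly a*b/1024
  return scoreMul(scoreDiv(product, 125), 128)
}

// 50.0 * 50.0 in fixed-point; the true result is 2,500,000 (= 2500.0)
const fastResult = fastMul(50000, 50000)    // intermediate wraps, result is garbage
const safeResult = safeMul(50000, 50000)    // close to 2,500,000, with some precision lost
// 1000.0 * 1000.0; the true result is 1,000,000,000
const safeLarge = safeMul(1000000, 1000000)
```

The same harness makes the precision cost visible: small operands lose proportionally more to the `/ 32` and `/ 125` steps, which is worth measuring before settling on the constants.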
2230
+
2231
+ ### Functions that always use safe logic (regardless of mode)
2232
+
2233
+ - `normalize2d` / `normalize3d` — internal `x * 1_000_000` has a ±2147 limit;
2234
+ reimplemented to avoid the intermediate, always safe
2235
+ - `length2d` / `length3d` — internal `x * x` safe up to ±46,341 (= ±46.341 units);
2236
+ acceptable for most MC distances; document the constraint
2237
+
2238
+ ### Profiler-verified: safe math adds ~4 commands per float multiply
2239
+
2240
+ The profiler can measure the exact command-count cost of safe vs fast math.
2241
+ Default is safe-on; users who need maximum performance and know their operand
2242
+ ranges can use explicit integer arithmetic instead of `float`.
2243
+