porffor 0.0.0-61de729 → 0.0.0-679c4ea

package/README.md CHANGED
@@ -1,11 +1,10 @@
1
1
  # porffor
2
- a basic experimental wip *aot* optimizing js -> wasm engine/compiler/runtime in js. not serious/intended for (real) use. (this is a straight forward, honest readme)<br>
3
- age: ~1 month
2
+ a basic experimental wip *aot* optimizing js -> wasm/c engine/compiler/runtime in js. not serious/intended for (real) use. (this is a straight forward, honest readme)<br>
3
+ age: ~2 months
4
4
 
5
5
  ## design
6
6
  porffor is a very unique js engine, due to a very different approach. it is seriously limited, but what it can do, it does pretty well. key differences:
7
7
  - 100% aot compiled *(not jit)*
8
- - everything is a number
9
8
  - no constant runtime/preluded code
10
9
  - least Wasm imports possible (only stdio)
11
10
 
@@ -18,6 +17,12 @@ porffor is mostly built from scratch, the only thing that is not is the parser (
18
17
  - no variables between scopes (except args and globals)
19
18
  - literal callees only in calls (eg `print()` works, `a = print; a()` does not)
20
19
 
20
+ ## rhemyn
21
+ rhemyn is porffor's own regex engine; it compiles literal regex to wasm bytecode aot (remind you of anything?). it is quite basic and wip. see [its readme](rhemyn/README.md) for more details.
22
+
23
+ ## 2c
24
+ 2c is porffor's own wasm -> c compiler, using generated wasm bytecode and internal info to generate specific and efficient/fast c code. no boilerplate/preluded code or required external files, just for cli binaries (not like wasm2c very much at all).
25
+
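
The 2c description above can be illustrated with a much smaller sketch of the same general idea: walk stack-machine instructions while keeping a stack of C expression strings. This is not porffor's code. The function name `toCExpression` and the string op names are hypothetical, chosen only for readability, and only a handful of `f64` arithmetic ops are handled.

```javascript
// Minimal sketch (assumption: instructions are [opName, arg] pairs with
// readable string op names; real wasm bytecode uses numeric opcodes).
const toCExpression = insts => {
  // Each entry is a C expression string standing in for a wasm stack value.
  const vals = [];
  const binary = { 'f64.add': '+', 'f64.sub': '-', 'f64.mul': '*', 'f64.div': '/' };

  for (const [op, arg] of insts) {
    if (op === 'f64.const') {
      vals.push(String(arg));
    } else if (op === 'local.get') {
      // Hypothetical: the local is already referred to by its C variable name.
      vals.push(arg);
    } else if (binary[op]) {
      // The right operand is on top of the stack, so pop it first.
      const b = vals.pop();
      const a = vals.pop();
      vals.push(`(${a} ${binary[op]} ${b})`);
    } else {
      throw new Error(`unsupported op: ${op}`);
    }
  }

  if (vals.length !== 1) throw new Error('expected exactly one value left on the stack');
  return vals[0];
};
```

Feeding it `local.get a`, `f64.const 2`, `f64.mul`, `local.get b`, `f64.add` yields one parenthesised C expression, which a caller could then place in a `return` or an assignment.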
21
26
  ## supported
22
27
  see [optimizations](#optimizations) for opts implemented/supported.
23
28
 
@@ -77,6 +82,10 @@ these include some early (stage 1/0) and/or dead (last commit years ago) proposa
77
82
  - string concat (`+`) (eg `'a' + 'b'`)
78
83
  - truthy/falsy (eg `!'' == true`)
79
84
  - string comparison (eg `'a' == 'a'`, `'a' != 'b'`)
85
+ - nullish coalescing operator (`??`)
86
+ - `for...of` (arrays and strings)
87
+ - array member setting (`arr[0] = 2`, `arr[0] += 2`, etc)
88
+ - array constructor (`Array(5)`, `new Array(1, 2, 3)`)
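
For reference, the four newly listed features look like this in ordinary JavaScript. This is a plain-JS sketch checked against standard semantics. The function names are made up for illustration, and whether this exact file compiles under this porffor version (given the limitations listed earlier in the readme) is an assumption.

```javascript
// Nullish coalescing: falls back only for null/undefined, not for 0 or ''.
function firstDefined(a, b) {
  return a ?? b;
}

// for...of over an array (strings iterate the same way, character by character).
function sumOf(values) {
  let total = 0;
  for (const v of values) total += v;
  return total;
}

// Array constructor plus member setting, including a compound assignment.
function demoArrays() {
  const arr = new Array(1, 2, 3); // [1, 2, 3]
  arr[0] = 2;                     // plain member set
  arr[0] += 2;                    // compound member set, arr[0] is now 4
  return arr;
}

// Array(5) with a single number argument creates an array of length 5.
function emptyOfLength(n) {
  return Array(n);
}
```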
80
89
 
81
90
  ### built-ins
82
91
 
@@ -89,7 +98,7 @@ these include some early (stage 1/0) and/or dead (last commit years ago) proposa
89
98
  - basic `eval` (literals only)
90
99
  - `Math.random()` using self-made xorshift128+ PRNG
91
100
  - some of `performance` (`now()`)
92
- - some of `Array.prototype` (`at`, `push`, `pop`, `shift`)
101
+ - some of `Array.prototype` (`at`, `push`, `pop`, `shift`, `fill`)
93
102
  - some of `String.prototype` (`at`, `charAt`, `charCodeAt`)
94
103
 
95
104
  ### custom
@@ -100,26 +109,35 @@ these include some early (stage 1/0) and/or dead (last commit years ago) proposa
100
109
  - intrinsic functions (see below)
101
110
  - inlining wasm via ``asm`...``\` "macro"
102
111
 
103
- ## soon todo
112
+ ## todo
113
+ no particular order and no guarantees, just what could happen soon™
114
+
104
115
  - arrays
105
- - member setting (`arr[0] = 2`)
106
116
  - more of `Array` prototype
107
117
  - arrays/strings inside arrays
118
+ - destructuring
108
119
  - strings
109
120
  - member setting
121
+ - objects
122
+ - basic object expressions (eg `{}`, `{ a: 0 }`)
123
+ - wasm
124
+ - *basic* wasm engine (interpreter) in js
110
125
  - more math operators (`**`, etc)
111
126
  - `do { ... } while (...)`
127
+ - rewrite `console.log` to work with strings/arrays
112
128
  - exceptions
113
- - `try { } finally {}`
129
+ - rewrite to use actual strings (optional?)
130
+ - `try { } finally { }`
114
131
  - rethrowing inside catch
115
132
  - optimizations
116
133
  - rewrite local indexes per func for smallest local header and remove unused idxs
117
134
  - smarter inline selection (snapshots?)
118
135
  - remove const ifs (`if (true)`, etc)
119
- - use data segments for initing arrays
120
136
 
121
- ## test262
122
- porffor can run test262 via some hacks/transforms which remove unsupported features whilst still doing the same asserts (eg simpler error messages using literals only). it currently passes >10% (see latest commit desc for latest and details). use `node test262` to test, it will also show a difference of overall results between the last commit and current results.
137
+ ## porfformance
138
+ *for the things it supports most of the time*, porffor is blazingly fast compared to most interpreters and to common engines running without JIT. compared to engines with JIT, it is not much slower the way a traditional interpreter would be; it is mostly about the same, or a bit faster/slower depending on the workload.
139
+
140
+ ![Screenshot of comparison chart](https://github.com/CanadaHonk/porffor/assets/19228318/76c75264-cc68-4be1-8891-c06dc389d97a)
123
141
 
124
142
  ## optimizations
125
143
  mostly for reducing size. do not really care about compiler perf/time as long as it is reasonable. we do not use/rely on external opt tools (`wasm-opt`, etc), instead doing optimization inside the compiler itself creating even smaller code sizes than `wasm-opt` itself can produce as we have more internal information. (this also enables fast + small runtime use as a potential cursed jit in frontend).
@@ -135,15 +153,20 @@ mostly for reducing size. do not really care about compiler perf/time as long as
135
153
  - `i64.extend_i32_s`, `i32.wrap_i64` -> ``
136
154
  - `f64.convert_i32_u`, `i32.trunc_sat_f64_s` -> ``
137
155
  - `return`, `end` -> `end`
156
+ - change const, convert to const of converted valtype (eg `f64.const`, `i32.trunc_sat_f64_s` -> `i32.const`)
138
157
  - remove some redundant sets/gets
139
158
  - remove unneeded single just used vars
140
159
  - remove unneeded blocks (no `br`s inside)
141
160
  - remove unused imports
161
+ - use data segments for initing arrays/strings
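
The const conversion item in the list above (`f64.const`, `i32.trunc_sat_f64_s` -> `i32.const`) can be sketched as a small peephole pass. This is an illustrative sketch, not porffor's implementation. `foldConstConversions`, `truncSatI32` and the string op names are hypothetical, and the saturating behaviour follows the WebAssembly definition of `i32.trunc_sat_f64_s` (NaN becomes 0, out-of-range values clamp).

```javascript
// Saturating f64 -> signed i32 truncation, as the wasm instruction defines it.
const truncSatI32 = x => {
  if (Number.isNaN(x)) return 0;
  if (x <= -2147483648) return -2147483648;
  if (x >= 2147483647) return 2147483647;
  return Math.trunc(x);
};

// Assumption: instructions are arrays like ['f64.const', 3.7] with readable
// op names; a real pass would compare numeric opcodes instead.
const foldConstConversions = insts => {
  const out = [];

  for (const inst of insts) {
    const prev = out[out.length - 1];

    // A constant followed directly by its conversion folds into one constant.
    if (inst[0] === 'i32.trunc_sat_f64_s' && prev && prev[0] === 'f64.const') {
      out[out.length - 1] = ['i32.const', truncSatI32(prev[1])];
      continue;
    }

    out.push(inst);
  }

  return out;
};
```

The pass only looks one instruction back, so it folds the pair solely when the conversion immediately follows the constant; anything else is copied through unchanged.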
142
162
 
143
163
  ### wasm module
144
164
  - type cache/index (no repeated types)
145
165
  - no main func if empty (and other exports)
146
166
 
167
+ ## test262
168
+ porffor can run test262 via some hacks/transforms which remove unsupported features whilst still doing the same asserts (eg simpler error messages using literals only). it currently passes >10% (see latest commit desc for latest and details). use `node test262` to test, it will also show a difference of overall results between the last commit and current results.
169
+
147
170
  ## codebase
148
171
  - `compiler`: contains the compiler itself
149
172
  - `builtins.js`: all built-ins of the engine (spec, custom. vars, funcs)
@@ -164,6 +187,10 @@ mostly for reducing size. do not really care about compiler perf/time as long as
164
187
  - `info.js`: runs with extra info printed
165
188
  - `repl.js`: basic repl (uses `node:repl`)
166
189
 
190
+ - `rhemyn`: contains [rhemyn](#rhemyn) - the regex engine used by porffor
191
+ - `compile.js`: compiles regex ast into wasm bytecode
192
+ - `parse.js`: own regex parser
193
+
167
194
  - `test`: contains many test files for majority of supported features
168
195
  - `test262`: test262 runner and utils
169
196
 
@@ -181,8 +208,13 @@ basically nothing will work :). see files in `test` for examples.
181
208
  you can also use deno (`deno run -A ...` instead of `node ...`), or bun (`bun ...` instead of `node ...`)
182
209
 
183
210
  ### flags
184
- - `-raw` for no info logs (just raw js output)
185
- - `-valtype=i32|i64|f64` to set valtype, f64 by default
211
+ - `-target=wasm|c|native` (default: `wasm`) to set target output (native compiles c output to binary, see args below)
212
+ - `-target=c|native` only:
213
+ - `-o=out.c|out.exe|out` to set file to output c or binary
214
+ - `-target=native` only:
215
+ - `-compiler=clang` to set compiler binary (path/name) to use to compile
216
+ - `-cO=O3` to set compiler opt argument
217
+ - `-valtype=i32|i64|f64` (default: `f64`) to set valtype
186
218
  - `-O0` to disable opt
187
219
  - `-O1` (default) to enable basic opt (simplify insts, treeshake wasm imports)
188
220
  - `-O2` to enable advanced opt (inlining)
@@ -190,11 +222,13 @@ you can also use deno (`deno run -A ...` instead of `node ...`), or bun (`bun ..
190
222
  - `-no-run` to not run wasm output, just compile
191
223
  - `-opt-log` to log some opts
192
224
  - `-code-log` to log some codegen (you probably want `-funcs`)
193
- - `-funcs` to log funcs (internal representations)
225
+ - `-regex-log` to log some regex
226
+ - `-funcs` to log funcs
194
227
  - `-opt-funcs` to log funcs after opt
195
228
  - `-sections` to log sections as hex
196
229
  - `-opt-no-inline` to not inline any funcs
197
- - `-tail-call` to enable tail calls (not widely implemented)
230
+ - `-tail-call` to enable tail calls (experimental + not widely implemented)
231
+ - `-compile-hints` to enable V8 compilation hints (experimental + doesn't seem to do much?)
198
232
 
199
233
  ## vscode extension
200
234
  there is a vscode extension in `porffor-for-vscode` which tweaks js syntax highlighting to be nicer with porffor features (eg highlighting wasm inside of inline asm).
package/c ADDED
Binary file
package/c.exe ADDED
Binary file
package/compiler/2c.js ADDED
@@ -0,0 +1,350 @@
1
+ import { read_ieee754_binary64, read_signedLEB128 } from './encoding.js';
2
+ import { Blocktype, Opcodes, Valtype } from './wasmSpec.js';
3
+ import { operatorOpcode } from './expression.js';
4
+
5
+ const CValtype = {
6
+ i8: 'char',
7
+ i16: 'unsigned short', // presume all i16 stuff is unsigned
8
+ i32: 'long',
9
+ i32_u: 'unsigned long',
10
+ i64: 'long long',
11
+ i64_u: 'unsigned long long',
12
+
13
+ f32: 'float',
14
+ f64: 'double',
15
+
16
+ undefined: 'void'
17
+ };
18
+
19
+ const inv = (obj, keyMap = x => x) => Object.keys(obj).reduce((acc, x) => { acc[keyMap(obj[x])] = x; return acc; }, {});
20
+ const invOpcodes = inv(Opcodes);
21
+
22
+ for (const x in CValtype) {
23
+ if (Valtype[x]) CValtype[Valtype[x]] = CValtype[x];
24
+ }
25
+
26
+ const todo = msg => {
27
+ class TodoError extends Error {
28
+ constructor(message) {
29
+ super(message);
30
+ this.name = 'TodoError';
31
+ }
32
+ }
33
+
34
+ throw new TodoError(`todo: ${msg}`);
35
+ };
36
+
37
+ const removeBrackets = str => str.startsWith('(') && str.endsWith(')') ? str.slice(1, -1) : str;
38
+
39
+ export default ({ funcs, globals, tags, exceptions, pages }) => {
40
+ const invOperatorOpcode = Object.values(operatorOpcode).reduce((acc, x) => {
41
+ for (const k in x) {
42
+ acc[x[k]] = k;
43
+ }
44
+ return acc;
45
+ }, {});
46
+ const invGlobals = inv(globals, x => x.idx);
47
+
48
+ const includes = new Map(), unixIncludes = new Map(), winIncludes = new Map();
49
+ let out = '';
50
+
51
+ for (const x in globals) {
52
+ const g = globals[x];
53
+
54
+ out += `${CValtype[g.type]} ${x} = ${g.init ?? 0}`;
55
+ out += ';\n';
56
+ }
57
+
58
+ for (const [ x, p ] of pages) {
59
+ out += `${CValtype[p.type]} ${x.replace(': ', '_').replace(/[^0-9a-zA-Z_]/g, '')}[100]`;
60
+ out += ';\n';
61
+ }
62
+
63
+ if (out) out += '\n';
64
+
65
+ let depth = 1;
66
+ const line = (str, semi = true) => out += `${' '.repeat(depth * 2)}${str}${semi ? ';' : ''}\n`;
67
+ const lines = lines => {
68
+ for (const x of lines) {
69
+ out += `${' '.repeat(depth * 2)}${x}\n`;
70
+ }
71
+ };
72
+
73
+ const platformSpecific = (win, unix, add = true) => {
74
+ let tmp = '';
75
+
76
+ if (win) {
77
+ if (add) out += '#ifdef _WIN32\n';
78
+ else tmp += '#ifdef _WIN32\n';
79
+
80
+ if (add) lines(win.split('\n'));
81
+ else tmp += win + (win.endsWith('\n') ? '' : '\n');
82
+ }
83
+
84
+ if (unix) {
85
+ if (add) out += (win ? '#else' : '#ifndef _WIN32') + '\n';
86
+ else tmp += (win ? '#else' : '#ifndef _WIN32') + '\n';
87
+
88
+ if (add) lines(unix.split('\n'));
89
+ else tmp += unix + (unix.endsWith('\n') ? '' : '\n');
90
+ }
91
+
92
+ if (win || unix)
93
+ if (add) out += '#endif\n';
94
+ else tmp += '#endif\n';
95
+
96
+ return tmp;
97
+ };
98
+
99
+ for (const f of funcs) {
100
+ depth = 1;
101
+
102
+ const invLocals = inv(f.locals, x => x.idx);
103
+ if (f.returns.length > 1) todo('funcs returning >1 value unsupported');
104
+
105
+ const sanitize = str => str.replace(/[^0-9a-zA-Z_]/g, _ => String.fromCharCode(97 + _.charCodeAt(0) % 32));
106
+
107
+ const returns = f.returns.length === 1;
108
+
109
+ const shouldInline = f.internal;
110
+ out += `${f.name === 'main' ? 'int' : CValtype[f.returns[0]]} ${shouldInline ? 'inline ' : ''}${sanitize(f.name)}(${f.params.map((x, i) => `${CValtype[x]} ${invLocals[i]}`).join(', ')}) {\n`;
111
+
112
+ const localKeys = Object.keys(f.locals).sort((a, b) => f.locals[a].idx - f.locals[b].idx).slice(f.params.length).sort((a, b) => f.locals[a].idx - f.locals[b].idx);
113
+ for (const x of localKeys) {
114
+ const l = f.locals[x];
115
+ line(`${CValtype[l.type]} ${x} = 0`);
116
+ }
117
+
118
+ if (localKeys.length !== 0) out += '\n';
119
+
120
+ let vals = [];
121
+ const endNeedsCurly = [], ignoreEnd = [];
122
+ let beginLoop = false, lastCond = false, ifTernary = false;
123
+ for (let _ = 0; _ < f.wasm.length; _++) {
124
+ const i = f.wasm[_];
125
+
126
+ if (invOperatorOpcode[i[0]]) {
127
+ const b = vals.pop();
128
+ const a = vals.pop();
129
+
130
+ let op = invOperatorOpcode[i[0]];
131
+ if (op.length === 3) op = op.slice(0, 2);
132
+
133
+ if (['==', '!=', '>', '>=', '<', '<='].includes(op)) lastCond = true;
134
+ else lastCond = false;
135
+
136
+ // vals.push(`${a} ${op} ${b}`);
137
+ vals.push(`(${removeBrackets(a)} ${op} ${b})`);
138
+ continue;
139
+ }
140
+
141
+ // misc insts
142
+ if (i[0] === 0xfc) {
143
+ switch (i[1]) {
144
+ // i32_trunc_sat_f64_s
145
+ case 0x02:
146
+ vals.push(`(${CValtype.i32})${vals.pop()}`);
147
+ break;
148
+
149
+ // i32_trunc_sat_f64_u
150
+ case 0x03:
151
+ vals.push(`(${CValtype.i32})(${CValtype.i32_u})${vals.pop()}`);
152
+ break;
153
+ }
154
+
155
+ lastCond = false;
156
+ continue;
157
+ }
158
+
159
+ switch (i[0]) {
160
+ case Opcodes.i32_const:
161
+ case Opcodes.i64_const:
162
+ vals.push(read_signedLEB128(i.slice(1)).toString());
163
+ break;
164
+
165
+ case Opcodes.f64_const:
166
+ vals.push(read_ieee754_binary64(i.slice(1)).toExponential());
167
+ break;
168
+
169
+ case Opcodes.local_get:
170
+ vals.push(`${invLocals[i[1]]}`);
171
+ break;
172
+
173
+ case Opcodes.local_set:
174
+ line(`${invLocals[i[1]]} = ${removeBrackets(vals.pop())}`);
175
+ break;
176
+
177
+ case Opcodes.local_tee:
178
+ line(`${invLocals[i[1]]} = ${removeBrackets(vals.pop())}`);
179
+ vals.push(`${invLocals[i[1]]}`);
180
+ // vals.push(`${invLocals[i[1]]} = ${vals.pop()}`);
181
+ break;
182
+
183
+ case Opcodes.global_get:
184
+ vals.push(`${invGlobals[i[1]]}`);
185
+ break;
186
+
187
+ case Opcodes.global_set:
188
+ line(`${invGlobals[i[1]]} = ${removeBrackets(vals.pop())}`);
189
+ break;
190
+
191
+ case Opcodes.f64_trunc:
192
+ // vals.push(`trunc(${vals.pop()})`);
193
+ vals.push(`(int)(${removeBrackets(vals.pop())})`); // this is ~10x faster with clang??
194
+ break;
195
+
196
+ case Opcodes.f64_convert_i32_u:
197
+ case Opcodes.f64_convert_i32_s:
198
+ case Opcodes.f64_convert_i64_u:
199
+ case Opcodes.f64_convert_i64_s:
200
+ // int to double
201
+ vals.push(`(double)${vals.pop()}`);
202
+ break;
203
+
204
+ case Opcodes.return:
205
+ line(`return${returns ? ` ${removeBrackets(vals.pop())}` : ''}`);
206
+ break;
207
+
208
+ case Opcodes.if:
209
+ let cond = removeBrackets(vals.pop());
210
+ if (!lastCond) {
211
+ if (cond.startsWith('(long)')) cond = `${cond.slice(6)} == 1e+0`;
212
+ else cond += ' == 1';
213
+ }
214
+
215
+ ifTernary = i[1] !== Blocktype.void;
216
+ if (ifTernary) {
217
+ ifTernary = cond;
218
+ break;
219
+ }
220
+
221
+ if (beginLoop) {
222
+ beginLoop = false;
223
+ line(`while (${cond}) {`, false);
224
+
225
+ depth++;
226
+ endNeedsCurly.push(true);
227
+ ignoreEnd.push(false, true);
228
+ break;
229
+ }
230
+
231
+ line(`if (${cond}) {`, false);
232
+
233
+ depth++;
234
+ endNeedsCurly.push(true);
235
+ ignoreEnd.push(false);
236
+ break;
237
+
238
+ case Opcodes.else:
239
+ if (ifTernary) break;
240
+
241
+ depth--;
242
+ line(`} else {`, false);
243
+ depth++;
244
+ break;
245
+
246
+ case Opcodes.loop:
247
+ // not doing properly, fake a while loop
248
+ beginLoop = true;
249
+ break;
250
+
251
+ case Opcodes.end:
252
+ if (ignoreEnd.pop()) break;
253
+
254
+ if (ifTernary) {
255
+ const b = vals.pop();
256
+ const a = vals.pop();
257
+ vals.push(`${ifTernary} ? ${a} : ${b}`);
258
+ break;
259
+ }
260
+
261
+ depth--;
262
+ if (endNeedsCurly.pop() === true) line('}', false);
263
+ break;
264
+
265
+ case Opcodes.call:
266
+ let func = funcs.find(x => x.index === i[1]);
267
+ if (!func) {
268
+ const importFunc = importFuncs[i[1]];
269
+ switch (importFunc.name) {
270
+ case 'print':
271
+ line(`printf("%f\\n", ${vals.pop()})`);
272
+ includes.set('stdio.h', true);
273
+ break;
274
+
275
+ case 'time':
276
+ line(`double _time_out`);
277
+ /* platformSpecific(
278
+ `FILETIME _time_filetime;
279
+ GetSystemTimeAsFileTime(&_time_filetime);
280
+
281
+ ULARGE_INTEGER _time_ularge;
282
+ _time_ularge.LowPart = _time_filetime.dwLowDateTime;
283
+ _time_ularge.HighPart = _time_filetime.dwHighDateTime;
284
+ _time_out = (_time_ularge.QuadPart - 116444736000000000i64) / 10000.;`,
285
+ `struct timespec _time;
286
+ clock_gettime(CLOCK_MONOTONIC, &_time);
287
+ _time_out = _time.tv_nsec / 1000000.;`); */
288
+ platformSpecific(
289
+ `LARGE_INTEGER _time_freq, _time_t;
290
+ QueryPerformanceFrequency(&_time_freq);
291
+ QueryPerformanceCounter(&_time_t);
292
+ _time_out = ((double)_time_t.QuadPart / _time_freq.QuadPart) * 1000.;`,
293
+ `struct timespec _time;
294
+ clock_gettime(CLOCK_MONOTONIC, &_time);
295
+ _time_out = _time.tv_nsec / 1000000.;`);
296
+ vals.push(`_time_out`);
297
+
298
+ unixIncludes.set('time.h', true);
299
+ winIncludes.set('windows.h', true);
300
+ break;
301
+
302
+ default:
303
+ log('2c', `unimplemented import: ${importFunc.name}`);
304
+ break;
305
+ }
306
+
307
+ break;
308
+ }
309
+
310
+ let args = [];
311
+ for (let j = 0; j < func.params.length; j++) args.unshift(removeBrackets(vals.pop()));
312
+
313
+ if (func.returns.length === 1) vals.push(`${sanitize(func.name)}(${args.join(', ')})`)
314
+ else line(`${sanitize(func.name)}(${args.join(', ')})`);
315
+
316
+ break;
317
+
318
+ case Opcodes.drop:
319
+ line(vals.pop());
320
+ break;
321
+
322
+ case Opcodes.br:
323
+ // ignore
324
+ // reset "stack"
325
+ vals = [];
326
+ break;
327
+
328
+ default:
329
+ log('2c', `unimplemented op: ${invOpcodes[i[0]]}`);
330
+ // todo(`unimplemented op: ${invOpcodes[i[0]]}`);
331
+ }
332
+
333
+ lastCond = false;
334
+ }
335
+
336
+ if (vals.length === 1 && returns) {
337
+ line(`return ${vals.pop()}`);
338
+ }
339
+
340
+ out += '}\n\n';
341
+ }
342
+
343
+ depth = 0;
344
+
345
+ const makeIncludes = includes => [...includes.keys()].map(x => `#include <${x}>\n`).join('');
346
+
347
+ out = platformSpecific(makeIncludes(winIncludes), makeIncludes(unixIncludes), false) + '\n' + makeIncludes(includes) + '\n' + out;
348
+
349
+ return out;
350
+ };
@@ -26,6 +26,12 @@ export const importedFuncs = [
26
26
  import: 't',
27
27
  params: 0,
28
28
  returns: 1
29
+ },
30
+ {
31
+ name: 'printStr',
32
+ import: 's',
33
+ params: 1,
34
+ returns: 0
29
35
  }
30
36
  ];
31
37
 
@@ -568,7 +574,6 @@ export const BuiltinFuncs = function() {
568
574
  params: [ Valtype.i32 ],
569
575
  locals: [],
570
576
  returns: [ Valtype.v128 ],
571
- memory: true,
572
577
  wasm: [
573
578
  [ Opcodes.local_get, 0 ],
574
579
  [ ...Opcodes.v128_load, 0, 0 ]