@simulatte/webgpu-doe 0.1.0 → 0.1.1
- package/README.md +129 -60
- package/native/doe_napi.c +631 -78
- package/package.json +12 -7
- package/prebuilds/darwin-arm64/doe_napi.node +0 -0
- package/prebuilds/darwin-arm64/libdoe_webgpu.dylib +0 -0
- package/src/index.js +143 -5
package/README.md CHANGED
@@ -1,69 +1,119 @@
 # @simulatte/webgpu-doe
 
-
+Headless WebGPU for Node.js, powered by the
+[Doe](https://github.com/clocksmith/fawn) runtime.
 
-
-standard `wgpu*` C ABI. This package wraps it as a Node.js N-API addon, giving
-JavaScript code a WebGPU compute surface without depending on browser runtimes.
+## What this is
 
-
+A native Metal WebGPU implementation for Node.js — no Dawn, no IPC, no
+11 MB sidecar. Doe compiles WGSL to MSL at runtime via an AST-based
+shader compiler and dispatches directly to Metal via a Zig + ObjC bridge.
 
-
-and JS wrapper. You must build `libdoe_webgpu` yourself and point the package
-to it. Prebuilt binaries are planned for a future release.
+This package ships:
 
-
+- **`libdoe_webgpu`** — Doe native runtime (~2 MB, Zig + Metal)
+- **`doe_napi.node`** — N-API addon bridging `libdoe_webgpu` to JavaScript
+- **`src/index.js`** — JS wrapper providing WebGPU-shaped classes and constants
 
-
-- A C compiler (node-gyp builds the N-API addon on `npm install`)
-- `libdoe_webgpu` shared library (`.dylib` / `.so` / `.dll`)
-- Dawn sidecar library (`libwebgpu_dawn`) — loaded by `libdoe_webgpu` at runtime
+## Architecture
 
-
+```
+JavaScript (DoeGPUDevice, DoeGPUBuffer, ...)
+|
+N-API addon (doe_napi.c)
+|
+libdoe_webgpu.dylib ← Doe native Metal backend, ~2 MB
+|
+Metal.framework ← GPU execution (Apple Silicon)
+```
 
-
+No Dawn dependency. All GPU calls go directly from Zig to Metal.
 
-
-cd fawn/zig
-zig build dropin
-# produces zig-out/lib/libdoe_webgpu.{dylib,so}
-```
+## Performance claims (Metal, Apple Silicon)
 
-
-`fawn/bench/vendor/dawn/` for build instructions.
+Apples-to-apples vs Dawn (Chromium's WebGPU), matched workloads and timing:
 
-
+- **Compute e2e** — 1.5x faster (0.23ms vs 0.35ms, 4096 threads)
+- **Buffer upload** — faster across 1 KB to 4 GB (8 sizes claimable)
+- **Atomics** — workgroup atomic and non-atomic both claimable
+- **Matrix-vector multiply** — 3 variants claimable (naive, swizzle, workgroup-shared)
+- **Concurrent execution** — claimable
+- **Zero-init workgroup memory** — claimable
+- **Draw throughput** — 200k draws claimable
+- **Binary size** — ~2 MB vs Dawn's ~11 MB
 
-
-
-
+19 of 30 workloads are claimable. The remaining 11 are bottlenecked by
+per-command Metal command buffer creation overhead (~350us vs Dawn's ~30us).
+See `fawn/bench/` for methodology and raw data.
 
-
-have a working C toolchain (`xcode-select --install` on macOS, `build-essential`
-on Debian/Ubuntu).
+## API surface
 
-
+Compute:
 
-
+- `create()` / `setupGlobals()` / `requestAdapter()` / `requestDevice()`
+- `device.createBuffer()` / `device.createShaderModule()` (WGSL)
+- `device.createComputePipeline()` / `device.createBindGroupLayout()`
+- `device.createBindGroup()` / `device.createPipelineLayout()`
+- `device.createCommandEncoder()` / `encoder.beginComputePass()`
+- `pass.setPipeline()` / `pass.setBindGroup()` / `pass.dispatchWorkgroups()`
+- `pass.dispatchWorkgroupsIndirect()`
+- `pipeline.getBindGroupLayout()`
+- `device.createComputePipelineAsync()`
+- `encoder.copyBufferToBuffer()` / `queue.submit()` / `queue.writeBuffer()`
+- `buffer.mapAsync()` / `buffer.getMappedRange()` / `buffer.unmap()`
+- `queue.onSubmittedWorkDone()`
 
-
-
-
-
+Render:
+
+- `device.createTexture()` / `texture.createView()` / `device.createSampler()`
+- `device.createRenderPipeline()` / `encoder.beginRenderPass()`
+- `renderPass.setPipeline()` / `renderPass.draw()` / `renderPass.end()`
+
+Device capabilities:
+
+- `device.limits` / `adapter.limits` — full Metal device limits
+- `device.features` / `adapter.features` — reports `shader-f16`
+
+Not yet supported: canvas/surface presentation, vertex/index buffer binding
+in render passes, full render pipeline descriptor parsing.
+
+## Backend readiness
+
+| Backend | Compute | Render | WGSL compiler | Status |
+|---------|---------|--------|---------------|--------|
+| **Metal** (macOS) | Production | Basic (no vertex/index) | WGSL -> MSL (AST-based) | Ready |
+| **Vulkan** (Linux) | WIP | Not started | WGSL -> SPIR-V needed | Experimental |
+| **D3D12** (Windows) | WIP | Not started | WGSL -> HLSL/DXIL needed | Experimental |
+
+**Metal** is the primary backend. All Doppler compute workloads run on Metal today:
+bind groups 0-3, buffer map/unmap, indirect dispatch, shader-f16, subgroups,
+override constants, workgroup shared memory, multiple entry points.
+
+**Vulkan** and **D3D12** have real native runtime paths (not stubs) with instance
+creation, compute dispatch, and buffer upload — but lack shader translation,
+bind group management, buffer map/unmap, textures, and render pipelines.
+
+See [`fawn/status.md`](../../status.md) for the full backend implementation matrix.
 
-
-
-
+## Platform support
+
+| Platform | Architecture | Status |
+|----------|-------------|--------|
+| macOS | arm64 | Prebuilt, tested |
+| macOS | x64 | Not yet built |
+| Linux | x64 | Not yet built (Vulkan backend experimental) |
+| Windows | x64 | Not yet built (D3D12 backend experimental) |
+
+## Install
 
 ```sh
-
-export DYLD_LIBRARY_PATH=/path/to/dawn/out/Release
-node your-app.js
+npm install @simulatte/webgpu-doe
 ```
 
-
+The N-API addon compiles from C source on install via node-gyp. This requires
+a C compiler (`xcode-select --install` on macOS).
 
-
+## Usage
 
 ```js
 import { create, globals } from '@simulatte/webgpu-doe';
@@ -72,10 +122,26 @@ const gpu = create();
 const adapter = await gpu.requestAdapter();
 const device = await adapter.requestDevice();
 
+console.log(device.limits.maxComputeWorkgroupSizeX); // 1024
+console.log(device.features.has('shader-f16')); // true
+
+// Standard WebGPU compute workflow
 const buffer = device.createBuffer({
   size: 64,
   usage: globals.GPUBufferUsage.STORAGE | globals.GPUBufferUsage.COPY_SRC,
 });
+
+const shader = device.createShaderModule({
+  code: `
+    @group(0) @binding(0) var<storage, read_write> data: array<f32>;
+    @compute @workgroup_size(64)
+    fn main(@builtin(global_invocation_id) id: vec3u) {
+      data[id.x] = data[id.x] * 2.0;
+    }
+  `,
+});
+
+// ... create pipeline, bind group, encode, dispatch, readback
 ```
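
The added usage snippet stops at a comment where the pipeline, bind group, dispatch and readback would go. The following is a minimal sketch of those remaining steps, assuming the device behaves like a standard WebGPU `GPUDevice`. The helper name `doubleOnGpu` is hypothetical, and several details are assumptions rather than anything this diff confirms: that `GPUMapMode` is available alongside `GPUBufferUsage`, that `layout: 'auto'` is accepted, and that `encoder.finish()` and `pass.end()` exist as in the WebGPU spec. The bounds check in the shader is an addition so that a partially filled workgroup stays inside the array.

```javascript
// Hypothetical helper, not part of @simulatte/webgpu-doe.
// `device` is a WebGPU-style device; `constants` supplies GPUBufferUsage and
// GPUMapMode (assumed to be obtainable, e.g. from the package's `globals`).
const WGSL = `
  @group(0) @binding(0) var<storage, read_write> data: array<f32>;
  @compute @workgroup_size(64)
  fn main(@builtin(global_invocation_id) id: vec3u) {
    if (id.x < arrayLength(&data)) {
      data[id.x] = data[id.x] * 2.0;
    }
  }
`;

async function doubleOnGpu(device, constants, input) {
  const { GPUBufferUsage, GPUMapMode } = constants;
  const size = input.byteLength;

  // Storage buffer the shader reads and writes, plus a mappable readback buffer.
  const storage = device.createBuffer({
    size,
    usage: GPUBufferUsage.STORAGE | GPUBufferUsage.COPY_SRC | GPUBufferUsage.COPY_DST,
  });
  const readback = device.createBuffer({
    size,
    usage: GPUBufferUsage.MAP_READ | GPUBufferUsage.COPY_DST,
  });
  device.queue.writeBuffer(storage, 0, input);

  // Pipeline and bind group; the 'auto' layout is an assumption about Doe.
  const pipeline = device.createComputePipeline({
    layout: 'auto',
    compute: { module: device.createShaderModule({ code: WGSL }), entryPoint: 'main' },
  });
  const bindGroup = device.createBindGroup({
    layout: pipeline.getBindGroupLayout(0),
    entries: [{ binding: 0, resource: { buffer: storage } }],
  });

  // Encode one dispatch (64 invocations per workgroup) and a copy for readback.
  const encoder = device.createCommandEncoder();
  const pass = encoder.beginComputePass();
  pass.setPipeline(pipeline);
  pass.setBindGroup(0, bindGroup);
  pass.dispatchWorkgroups(Math.ceil(input.length / 64));
  pass.end();
  encoder.copyBufferToBuffer(storage, 0, readback, 0, size);
  device.queue.submit([encoder.finish()]);

  // Map, copy the bytes out, then unmap so the buffer can be reused.
  await readback.mapAsync(GPUMapMode.READ);
  const result = new Float32Array(readback.getMappedRange().slice(0));
  readback.unmap();
  return result;
}
```

With the package installed, a call might look like `await doubleOnGpu(device, globals, new Float32Array(16))`, provided `globals` exposes both constant tables; that is untested here.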
 
 ### Setup globals (navigator.gpu)
@@ -84,33 +150,36 @@ const buffer = device.createBuffer({
 import { setupGlobals } from '@simulatte/webgpu-doe';
 
 setupGlobals(globalThis);
-
 const adapter = await navigator.gpu.requestAdapter();
-const device = await adapter.requestDevice();
 ```
 
-###
+### Provider info
 
 ```js
-import {
-
-
-const device = await requestDevice();
+import { providerInfo } from '@simulatte/webgpu-doe';
+console.log(providerInfo());
+// { module: '@simulatte/webgpu-doe', loaded: true, doeNative: true, ... }
 ```
 
-##
+## Configuration
+
+The library search order:
+
+1. `DOE_WEBGPU_LIB` environment variable (full path)
+2. `<package>/prebuilds/<platform>-<arch>/libdoe_webgpu.{ext}`
+3. `<workspace>/zig/zig-out/lib/libdoe_webgpu.{ext}` (monorepo layout)
+4. `<cwd>/zig/zig-out/lib/libdoe_webgpu.{ext}`
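
The four-step search order above can be read as a first-match lookup over an ordered candidate list. The sketch below illustrates that reading only; it is not the resolver in `src/index.js`. The function names, the option names, and the platform-to-extension mapping (`dylib`, `so`, `dll`) are assumptions, and paths are joined with `/` for brevity.

```javascript
// Hypothetical sketch of the documented search order; the package's real
// resolver may differ.
const EXT_BY_PLATFORM = { darwin: 'dylib', linux: 'so', win32: 'dll' }; // assumed mapping

function candidateLibraryPaths({ env, packageRoot, workspaceRoot, cwd, platform, arch }) {
  const ext = EXT_BY_PLATFORM[platform];
  if (!ext) throw new Error(`unsupported platform: ${platform}`);
  const file = `libdoe_webgpu.${ext}`;
  const candidates = [];
  // 1. Explicit override comes first when the variable is set.
  if (env.DOE_WEBGPU_LIB) candidates.push(env.DOE_WEBGPU_LIB);
  // 2. Prebuilt binary shipped inside the package.
  candidates.push(`${packageRoot}/prebuilds/${platform}-${arch}/${file}`);
  // 3. Build output in a monorepo checkout.
  candidates.push(`${workspaceRoot}/zig/zig-out/lib/${file}`);
  // 4. Build output relative to the current working directory.
  candidates.push(`${cwd}/zig/zig-out/lib/${file}`);
  return candidates;
}

// Returns the first candidate for which `exists` reports true, or null.
function resolveLibrary(options, exists) {
  return candidateLibraryPaths(options).find((p) => exists(p)) ?? null;
}
```

Passing the existence check in as a callback keeps the sketch free of file-system access; in a real Node.js program `fs.existsSync` would be a natural argument, with `process.env`, `process.platform`, `process.arch` and `process.cwd()` supplying the options.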
 
-
+## Building from source
 
-
-- Buffer create, map, unmap, destroy
-- Shader module creation (WGSL)
-- Compute pipeline, bind group layout, bind group, pipeline layout
-- Command encoder, compute pass (setPipeline, setBindGroup, dispatch, end)
-- Buffer-to-buffer copy
-- Queue submit, queue writeBuffer
+Requires [Zig](https://ziglang.org/download/) (0.15+).
 
-
+```sh
+git clone https://github.com/clocksmith/fawn
+cd fawn/zig
+zig build dropin
+# Output: zig-out/lib/libdoe_webgpu.{dylib,so}
+```
 
 ## License
 