@simulatte/webgpu 0.3.0 → 0.3.2

Files changed (50)
  1. package/CHANGELOG.md +37 -10
  2. package/LICENSE +191 -0
  3. package/README.md +62 -48
  4. package/api-contract.md +67 -49
  5. package/architecture.md +317 -0
  6. package/assets/package-layers.svg +3 -3
  7. package/docs/doe-api-reference.html +1842 -0
  8. package/doe-api-design.md +237 -0
  9. package/examples/doe-api/README.md +19 -0
  10. package/examples/doe-api/buffers-readback.js +3 -2
  11. package/examples/{doe-routines/compute-once-like-input.js → doe-api/compute-one-shot-like-input.js} +1 -1
  12. package/examples/{doe-routines/compute-once-matmul.js → doe-api/compute-one-shot-matmul.js} +2 -2
  13. package/examples/{doe-routines/compute-once-multiple-inputs.js → doe-api/compute-one-shot-multiple-inputs.js} +1 -1
  14. package/examples/{doe-routines/compute-once.js → doe-api/compute-one-shot.js} +1 -1
  15. package/examples/doe-api/{compile-and-dispatch.js → kernel-create-and-dispatch.js} +4 -6
  16. package/examples/doe-api/{compute-dispatch.js → kernel-run.js} +4 -6
  17. package/headless-webgpu-comparison.md +3 -3
  18. package/jsdoc-style-guide.md +435 -0
  19. package/native/doe_napi.c +1481 -84
  20. package/package.json +18 -6
  21. package/prebuilds/darwin-arm64/doe_napi.node +0 -0
  22. package/prebuilds/darwin-arm64/libwebgpu_doe.dylib +0 -0
  23. package/prebuilds/darwin-arm64/metadata.json +5 -5
  24. package/prebuilds/linux-x64/metadata.json +1 -1
  25. package/scripts/generate-doe-api-docs.js +1607 -0
  26. package/scripts/generate-readme-assets.js +3 -3
  27. package/src/build_metadata.js +7 -4
  28. package/src/bun-ffi.js +1229 -474
  29. package/src/bun.js +5 -1
  30. package/src/compute.d.ts +16 -7
  31. package/src/compute.js +84 -53
  32. package/src/full.d.ts +16 -7
  33. package/src/full.js +12 -10
  34. package/src/index.js +679 -1324
  35. package/src/runtime_cli.js +17 -17
  36. package/src/shared/capabilities.js +144 -0
  37. package/src/shared/compiler-errors.js +78 -0
  38. package/src/shared/encoder-surface.js +295 -0
  39. package/src/shared/full-surface.js +514 -0
  40. package/src/shared/public-surface.js +82 -0
  41. package/src/shared/resource-lifecycle.js +120 -0
  42. package/src/shared/validation.js +495 -0
  43. package/src/webgpu_constants.js +30 -0
  44. package/support-contracts.md +2 -2
  45. package/compat-scope.md +0 -46
  46. package/layering-plan.md +0 -259
  47. package/src/auto_bind_group_layout.js +0 -32
  48. package/src/doe.d.ts +0 -184
  49. package/src/doe.js +0 -641
  50. package/zig-source-inventory.md +0 -468
@@ -0,0 +1,317 @@
1
+ # @simulatte/webgpu architecture
2
+
3
+ This document maps the full runtime stack from Zig native code to the npm
4
+ package surface. It is the single reference for how layers compose.
5
+
6
+ For contract details see the companion docs:
7
+
8
+ - `api-contract.md` — current implemented JS contract, scope and non-goals
9
+ - `doe-api-design.md` — helper naming direction
10
+ - `support-contracts.md` — product scope and support tiers
11
+
12
+ ## Layer diagram
13
+
14
+ ```
15
+ ┌──────────────────────────────────────────────────────────┐
16
+ │ Package exports │
17
+ │ create · requestAdapter · requestDevice · doe · globals │
18
+ │ providerInfo · preflightShaderSource · setupGlobals │
19
+ │ createDoeRuntime · runDawnVsDoeCompare │
20
+ │ 10 functions │
21
+ ├──────────────────────────────────────────────────────────┤
22
+ │ Doe helpers (@simulatte/webgpu-doe) │
23
+ │ doe.requestDevice · doe.bind │
24
+ │ gpu.buffer.create · gpu.buffer.read │
25
+ │ gpu.kernel.run · gpu.kernel.create │
26
+ │ gpu.compute │
27
+ │ 7 methods across 3 namespaces (buffer, kernel, compute) │
28
+ ├──────────────────────────────────────────────────────────┤
29
+ │ WebGPU JS surface (shared/full-surface.js, │
30
+ │ shared/encoder-surface.js) │
31
+ │ DoeGPU · DoeGPUAdapter · DoeGPUDevice │
32
+ │ DoeGPUBuffer · DoeGPUQueue · DoeGPUCommandEncoder │
33
+ │ DoeGPUComputePassEncoder · DoeGPURenderPassEncoder │
34
+ │ DoeGPUTexture · DoeGPUShaderModule · DoeGPUQuerySet │
35
+ │ + 6 trivial resource classes │
36
+ │ ~95 methods across 16 classes │
37
+ │ This layer is WebGPU spec conformance, not Doe API. │
38
+ ├────────────────┬─────────────────────────────────────────┤
39
+ │ N-API addon │ Bun FFI binding │
40
+ │ (Node.js) │ (Bun) │
41
+ │ doe_napi.c │ bun-ffi.js │
42
+ │ 61 functions │ 65 base + 13 Darwin-only = 78 symbols │
43
+ │ │ │
44
+ │ Parallel transports — same JS surface consumes either. │
45
+ │ Not 1:1: N-API has fused batch ops, Bun FFI has flat │
46
+ │ variants and platform-conditional Doe-native symbols. │
47
+ ├────────────────┴─────────────────────────────────────────┤
48
+ │ Zig native ABI (zig/src/doe_*.zig) │
49
+ │ 76 pub export fn with C calling convention │
50
+ │ │
51
+ │ doe_wgpu_native.zig ···· 29 instance/adapter/device/ │
52
+ │ buffer/queue/encoder │
53
+ │ doe_shader_native.zig ·· 11 shader module/pipeline/ │
54
+ │ error reporting │
55
+ │ doe_compute_ext_native .. 7 compute pass ops │
56
+ │ doe_bind_group_native .. 6 bind group/pipeline layout │
57
+ │ doe_render_native.zig ·· 17 texture/sampler/render │
58
+ │ doe_device_caps.zig ···· 4 feature/limits queries │
59
+ │ doe_query_native.zig ··· 4 timestamp queries │
60
+ ├──────────────────────────────────────────────────────────┤
61
+ │ WGSL compiler (zig/src/doe_wgsl/) │
62
+ │ lexer → parser → sema → ir_builder → ir_validate │
63
+ │ → emit_msl_ir / emit_spirv / emit_hlsl / emit_dxil │
64
+ ├──────────────────────────────────────────────────────────┤
65
+ │ Metal / Vulkan / D3D12 backends │
66
+ │ zig/src/backend/{metal,vulkan,d3d12}/ │
67
+ └──────────────────────────────────────────────────────────┘
68
+ ```
69
+
70
+ ## Layer details
71
+
72
+ ### 1. Zig native ABI (76 functions)
73
+
74
+ The bottom of the stack. Every GPU operation is a `pub export fn` with C
75
+ calling convention in `zig/src/doe_*.zig`. These functions directly call Metal,
76
+ Vulkan, or D3D12 backend code.
77
+
78
+ Files and responsibilities:
79
+
80
+ | File | Count | Scope |
81
+ |------|-------|-------|
82
+ | `doe_wgpu_native.zig` | 29 | Instance, adapter, device, buffer, queue, command encoder |
83
+ | `doe_shader_native.zig` | 11 | Shader module creation, compute pipeline, structured error reporting |
84
+ | `doe_compute_ext_native.zig` | 7 | Compute pass: setPipeline, setBindGroup, dispatch, end, getBindGroupLayout |
85
+ | `doe_bind_group_native.zig` | 6 | Bind group layout, bind group, pipeline layout (create + release) |
86
+ | `doe_render_native.zig` | 17 | Texture, texture view, sampler, render pipeline, render pass ops |
87
+ | `doe_device_caps.zig` | 4 | hasFeature, getLimits for adapter and device |
88
+ | `doe_query_native.zig` | 4 | Query set creation, writeTimestamp, resolveQuerySet, destroy |
89
+
90
+ Constants governing the ABI:
91
+
92
+ - `BINDINGS_PER_GROUP = 16` — MSL buffer slot formula: `group * 16 + binding`
93
+ - `MAX_BIND_GROUPS = 4` — maximum bind groups per pipeline
94
+ - `MAX_FLAT_BIND = 64` — flat buffer array size (4 * 16)
95
+
96
+ Lean proofs verify these constants produce collision-free, bounded slot
97
+ mappings (`Fawn.Core.BindGroupSlot`).
98
+
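Reviewer note: the slot formula and the two properties attributed to the Lean proofs (bounded and collision-free) can be spot-checked with a short script. This is a minimal sketch, assuming only the three constants quoted above; `flatSlot` and `checkSlotMapping` are hypothetical names for illustration and are not part of the package or of `Fawn.Core.BindGroupSlot`.

```javascript
// Constants as stated in architecture.md.
const BINDINGS_PER_GROUP = 16;
const MAX_BIND_GROUPS = 4;
const MAX_FLAT_BIND = MAX_BIND_GROUPS * BINDINGS_PER_GROUP; // 64

// Hypothetical helper (not a package API): maps (group, binding) to a flat
// slot using the documented formula group * 16 + binding.
function flatSlot(group, binding) {
  if (!Number.isInteger(group) || group < 0 || group >= MAX_BIND_GROUPS) {
    throw new RangeError(`group ${group} out of range`);
  }
  if (!Number.isInteger(binding) || binding < 0 || binding >= BINDINGS_PER_GROUP) {
    throw new RangeError(`binding ${binding} out of range`);
  }
  return group * BINDINGS_PER_GROUP + binding;
}

// Exhaustively checks that every valid pair lands below MAX_FLAT_BIND
// and that no two pairs share a slot.
function checkSlotMapping() {
  const seen = new Set();
  for (let g = 0; g < MAX_BIND_GROUPS; g++) {
    for (let b = 0; b < BINDINGS_PER_GROUP; b++) {
      const slot = flatSlot(g, b);
      if (slot >= MAX_FLAT_BIND || seen.has(slot)) return false;
      seen.add(slot);
    }
  }
  return seen.size === MAX_FLAT_BIND;
}
```

An exhaustive loop over 64 pairs is a sanity check, not a substitute for the proofs; it only illustrates what injectivity and boundedness mean for this mapping.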
99
+ ### 2. Transport layer (N-API or Bun FFI)
100
+
101
+ Two parallel implementations that bridge Zig native → JavaScript. The JS
102
+ surface classes (layer 3) consume whichever transport is active at runtime.
103
+
104
+ #### N-API addon (Node.js) — 61 functions
105
+
106
+ `native/doe_napi.c` wraps Zig functions via Node-API. Includes fused
107
+ operations not in Bun FFI:
108
+
109
+ - `doe_submit_batched` — batch command buffer submission
110
+ - `doe_submit_compute_dispatch_copy` — fused dispatch + copy
111
+ - `doe_flush_and_map_sync` — fused flush + synchronous map
112
+ - `doe_buffer_assert_mapped_prefix_f32` — assertion helper
113
+
114
+ #### Bun FFI (Bun) — 78 symbols
115
+
116
+ `src/bun-ffi.js` uses `dlopen` to bind C symbols directly. Uses `wgpu*`
117
+ naming for standard WebGPU C API symbols and `doeNative*` for Doe-specific
118
+ functions.
119
+
120
+ Differences from N-API:
121
+
122
+ - Has "flat" variants (`doeRequestAdapterFlat`, `doeBufferMapAsyncFlat`) for
123
+ struct layout compatibility with Bun's FFI
124
+ - 13 Darwin-only symbols added conditionally (error getters, query set,
125
+ queue flush, compute dispatch flush)
126
+ - Does not have N-API's fused batch operations
127
+
128
+ ### 3. WebGPU JS surface (~95 methods, 16 classes)
129
+
130
+ `src/shared/full-surface.js` and `src/shared/encoder-surface.js` implement
131
+ the WebGPU API as JavaScript classes. This is spec-conformant glue, not
132
+ Doe-specific API.
133
+
134
+ | Class | Key methods |
135
+ |-------|-------------|
136
+ | `DoeGPU` | `requestAdapter`, `getPreferredCanvasFormat` |
137
+ | `DoeGPUAdapter` | `requestDevice`, `hasFeature`, `getFeatures`, `getLimits` |
138
+ | `DoeGPUDevice` | 11 `create*` methods, `getQueue`, `hasFeature`, `getLimits`, `destroy` |
139
+ | `DoeGPUBuffer` | `mapAsync`, `getMappedRange`, `unmap`, `destroy` |
140
+ | `DoeGPUQueue` | `submit`, `writeBuffer`, `copy`, `writeTimestamp` |
141
+ | `DoeGPUCommandEncoder` | `beginComputePass`, `beginRenderPass`, 4 copy methods, `finish` |
142
+ | `DoeGPUComputePassEncoder` | `setPipeline`, `setBindGroup`, `dispatchWorkgroups`, `dispatchWorkgroupsIndirect`, `end` |
143
+ | `DoeGPURenderPassEncoder` | `setPipeline`, `draw`, `drawIndexed`, `setVertexBuffer`, `setIndexBuffer`, `end` |
144
+ | `DoeGPUTexture` | `createView`, `destroy` + readonly dimension/format properties |
145
+ | `DoeGPUShaderModule` | `getCompilationInfo` |
146
+ | `DoeGPUComputePipeline` | `getBindGroupLayout` |
147
+ | `DoeGPURenderPipeline` | `getBindGroupLayout` |
148
+ | `DoeGPUQuerySet` | `destroy`, readonly `type`/`count` |
149
+
150
+ Shared helpers in `src/shared/`:
151
+
152
+ - `compiler-errors.js` — WGSL error enrichment with structured fields
153
+ - `validation.js` — input validation utilities
154
+ - `capabilities.js` — device capability detection
155
+ - `resource-lifecycle.js` — buffer/resource lifecycle helpers
156
+
157
+ ### 4. Doe helpers (7 methods, 3 namespaces)
158
+
159
+ `@simulatte/webgpu-doe` provides the Doe-specific compute convenience API
160
+ across `gpu.buffer.*`, `gpu.kernel.*`, and `gpu.compute(...)`.
161
+
162
+ For exact method signatures and behavior, see
163
+ [`api-contract.md`](./api-contract.md) (section `doe`).
164
+ Type declarations: `@simulatte/webgpu-doe/src/index.d.ts`.
165
+
166
+ ### 5. Package exports (10 functions)
167
+
168
+ Entry files: `src/node-runtime.js` (Node.js), `src/bun.js` (Bun),
169
+ `src/full.js` (full surface), `src/compute.js` (compute-only subset).
170
+
171
+ For exact export signatures, see
172
+ [`api-contract.md`](./api-contract.md) (sections `Top-level package API`
173
+ through `CLI contract`).
174
+
175
+ Export paths from `package.json`:
176
+
177
+ ```json
178
+ {
179
+ ".": { "types": "./src/full.d.ts", "bun": "./src/bun.js", "default": "./src/node-runtime.js" },
180
+ "./bun": { "types": "./src/full.d.ts", "default": "./src/bun.js" },
181
+ "./node": { "types": "./src/full.d.ts", "default": "./src/node-runtime.js" },
182
+ "./compute":{ "types": "./src/compute.d.ts", "default": "./src/compute.js" },
183
+ "./full": { "types": "./src/full.d.ts", "default": "./src/full.js" }
184
+ }
185
+ ```
186
+
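Reviewer note: the export map above selects a different entry file per runtime. The sketch below is a simplified model of conditional-export matching, assuming the usual rule that keys are tried in declaration order and the first active condition wins. `resolveExport` is a hypothetical helper, not the resolver that Node.js or Bun actually ships, and it ignores features such as nested conditions and subpath patterns.

```javascript
// Export map copied from the package.json excerpt in architecture.md.
const exportsMap = {
  ".": { types: "./src/full.d.ts", bun: "./src/bun.js", default: "./src/node-runtime.js" },
  "./bun": { types: "./src/full.d.ts", default: "./src/bun.js" },
  "./node": { types: "./src/full.d.ts", default: "./src/node-runtime.js" },
  "./compute": { types: "./src/compute.d.ts", default: "./src/compute.js" },
  "./full": { types: "./src/full.d.ts", default: "./src/full.js" },
};

// Hypothetical helper: returns the target for `subpath` given the set of
// conditions the consumer has active, or null when nothing matches.
function resolveExport(map, subpath, conditions) {
  const entry = map[subpath];
  if (entry === undefined) return null;
  // Object.entries preserves declaration order for string keys.
  for (const [condition, target] of Object.entries(entry)) {
    if (condition === "default" || conditions.includes(condition)) {
      return target;
    }
  }
  return null;
}
```

Under this model, a consumer with the `bun` condition active gets `./src/bun.js` for the root import, while one without it falls through to `./src/node-runtime.js`. Because `types` is listed first, a type checker that activates `types` resolves to the `.d.ts` file before any runtime entry is considered.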
187
+ ## Data flow
188
+
189
+ A typical compute dispatch flows through the stack:
190
+
191
+ ```
192
+ gpu.compute({ code, inputs, output, workgroups })
193
+ → gpu.kernel.run({ code, bindings, workgroups }) Doe helpers
194
+ → device.createShaderModule(descriptor) JS surface
195
+ → addon.createShaderModule(dev, desc) N-API transport
196
+ → doeNativeDeviceCreateShaderModule(...) Zig native ABI
197
+ → doe_wgsl lexer → parser → sema → IR WGSL compiler
198
+ → emit_msl_ir → Metal compileLibrary Backend
199
+ → device.createComputePipeline(descriptor)
200
+ → encoder.beginComputePass()
201
+ → pass.setPipeline(pipeline)
202
+ → pass.setBindGroup(0, bindGroup)
203
+ → pass.dispatchWorkgroups(x, y, z)
204
+ → pass.end()
205
+ → encoder.finish()
206
+ → queue.submit([commandBuffer])
207
+ → buffer.mapAsync(GPUMapMode.READ)
208
+ → buffer.getMappedRange()
209
+ → return Float32Array(mappedData)
210
+ ```
211
+
212
+ ## Formal verification coverage
213
+
214
+ Lean proofs in `lean/Fawn/Core/` verify properties of the native ABI layer:
215
+
216
+ - **BindGroupSlot** — slot mapping injectivity and bounds (4 theorems)
217
+ - **BufferLifecycle** — state machine idempotency, terminal state, spec gap
218
+ documentation (9 theorems)
219
+ - **Dispatch** — identity actions, scope×command completeness (7 theorems)
220
+ - **Model** — safety class ranking, proof level requirements (2 theorems)
221
+
222
+ Proof artifacts are extracted to `lean/artifacts/proven-conditions.json`
223
+ (40 theorems total across all modules).
224
+
225
+ ## Known spec divergences
226
+
227
+ Formally documented in `Fawn.Core.BufferLifecycle`:
228
+
229
+ | Operation | WebGPU spec | Doe behavior | Reason |
230
+ |-----------|-------------|--------------|--------|
231
+ | `getMappedRange` on unmapped buffer | Validation error | Succeeds (returns UMA pointer) | Apple Silicon unified memory |
232
+ | `dispatch` with mapped buffer | Validation error | Succeeds | No mapped-state precondition check |
233
+ | `buffer.destroy` | Marks unusable | Immediately frees (`doeBufferRelease`) | Simpler lifecycle |
234
+
235
+ ## Core/full runtime split
236
+
237
+ The Zig source is physically split into `core` and `full` subtrees. The JS
238
+ package is a single artifact today; the source boundary enables a future binary
239
+ split.
240
+
241
+ ### Boundary rules
242
+
243
+ 1. `full` composes `core`; it does not toggle `core`.
244
+ 2. `core` must never import `full`.
245
+ 3. `full` may depend on `core` Zig modules, Lean modules, build outputs, and JS helpers.
246
+ 4. Chromium Track A depends on the full runtime artifact and browser-specific gates, not on npm package layout.
247
+
248
+ Anti-bleed:
249
+
250
+ - no `if full_enabled` branches inside `core`
251
+ - no `full` fields added to `core` structs
252
+ - no browser-policy logic added to `full`
253
+
254
+ `full` extends `core` by composition (wrapper types holding core values), never
255
+ by mutating `core` types in place.
256
+
257
+ ### Import fence
258
+
259
+ `zig/src/core/**` may not import any file under `zig/src/full/**`.
260
+ `lean/Fawn/Core/**` may not import any file under `lean/Fawn/Full/**`.
261
+ Any exception requires redesign, not a one-off waiver.
262
+
263
+ CI enforcement for this fence is not yet implemented. It is the highest-priority
264
+ remaining structural artifact.
265
+
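Reviewer note: since the fence is described as not yet enforced in CI, here is a minimal sketch of what a path-dependency audit for the Zig half could look like. It assumes relative `@import("...zig")` paths and POSIX separators, takes an in-memory `{ path: source }` map rather than reading the repository, and does not cover the `lean/Fawn/Core/**` fence. `resolveImport` and `findFenceViolations` are hypothetical names, not existing project tooling.

```javascript
// Resolves a relative import against the importing file's directory.
// POSIX-style paths only; "." and ".." segments are normalized.
function resolveImport(fromFile, target) {
  const parts = fromFile.split("/").slice(0, -1);
  for (const segment of target.split("/")) {
    if (segment === "" || segment === ".") continue;
    if (segment === "..") parts.pop();
    else parts.push(segment);
  }
  return parts.join("/");
}

// Hypothetical audit: reports every file under coreRoot whose @import
// resolves to a path under fullRoot.
function findFenceViolations(
  files,
  coreRoot = "zig/src/core/",
  fullRoot = "zig/src/full/"
) {
  const violations = [];
  const importPattern = /@import\("([^"]+)"\)/g;
  for (const [file, source] of Object.entries(files)) {
    if (!file.startsWith(coreRoot)) continue;
    for (const match of source.matchAll(importPattern)) {
      const target = match[1];
      // Skip named modules such as "std"; only file imports are audited.
      if (!target.endsWith(".zig")) continue;
      const resolved = resolveImport(file, target);
      if (resolved.startsWith(fullRoot)) {
        violations.push({ file, target, resolved });
      }
    }
  }
  return violations;
}
```

A CI job could fail whenever the returned array is non-empty. A regex scan will miss imports routed through build-system module names, so a real check would likely need to consult the build graph as well.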
266
+ ### Physical layout
267
+
268
+ ```text
269
+ zig/src/core/
270
+ mod.zig (17 public exports)
271
+ abi/ type definitions, loader, proc aliases
272
+ compute/ compute command module
273
+ queue/ queue/sync FFI
274
+ resource/ buffer, texture, copy, resource normalizers
275
+ trace/ tracing
276
+ replay/ replay
277
+
278
+ zig/src/full/
279
+ mod.zig (7 public exports)
280
+ render/ render API, draw loops, samplers, PLS (12 files)
281
+ surface/ FFI surface, surface commands, macOS surface (5 files)
282
+ lifecycle/ async diagnostics
283
+ modules/ rendering services, compute services, resource scheduler
284
+
285
+ lean/Fawn/Core/ BindGroupSlot, Bridge, BufferLifecycle, Dispatch, Model, Runtime
286
+ lean/Fawn/Full/ Comparability, ComparabilityFixtures, WorkloadGeometry
287
+ ```
288
+
289
+ Root compatibility facades (~57 files) remain at `zig/src/` while callers
290
+ retarget. `webgpu_ffi.zig` still owns `WebGPUBackend` and is the load-bearing
291
+ public boundary.
292
+
293
+ ### Coverage split
294
+
295
+ - `config/webgpu-core-coverage.json` — 10 core commands
296
+ - `config/webgpu-full-coverage.json` — 24 commands (core + full)
297
+ - `zig build test-core` and `zig build test-full` exist; split test coverage is thin
298
+
299
+ ### Remaining extraction work
300
+
301
+ 1. Add import-fence CI check (simple path-dependency audit in GitHub Actions)
302
+ 2. Shrink public facade files: `model.zig`, `webgpu_ffi.zig`, `main.zig`, `execution.zig`
303
+ 3. Retire root compatibility facades: `wgpu_commands.zig`, `wgpu_resources.zig`, `wgpu_extended_commands.zig`
304
+ 4. Split backend roots (still own mixed compute/render/surface state): `backend/metal/mod.zig`, `backend/vulkan/mod.zig`, `backend/d3d12/mod.zig`
305
+ 5. Retire legacy unified `config/webgpu-spec-coverage.json`
306
+ 6. Build separate `libwebgpu_doe_core.so` and `libwebgpu_doe_full.so`
307
+
308
+ ### Extraction hotspots
309
+
310
+ Files with the strongest remaining core/full bleed:
311
+
312
+ - `model.zig`, `webgpu_ffi.zig`, `main.zig`, `execution.zig` — mixed public boundary
313
+ - `doe_wgpu_native.zig`, `doe_compute_fast.zig`, `doe_shader_native.zig` — legacy monolithic ABI
314
+ - `backend/metal/mod.zig`, `metal_native_runtime.zig` — mixed backend root
315
+ - `backend/vulkan/mod.zig`, `native_runtime.zig`, `vulkan_runtime_state.zig` — mixed backend root
316
+ - `backend/d3d12/mod.zig` — mixed backend root
317
+ - `backend/backend_iface.zig`, `backend_registry.zig`, `backend_runtime.zig` — mixed command set
@@ -1,7 +1,7 @@
1
1
  <!-- Generated by scripts/generate-readme-assets.js. Do not edit by hand. -->
2
2
  <svg xmlns="http://www.w3.org/2000/svg" width="1200" height="470" viewBox="0 0 1200 470" role="img" aria-labelledby="layers-title layers-desc">
3
3
  <title id="layers-title">@simulatte/webgpu layered package graph</title>
4
- <desc id="layers-desc">Layered package graph showing direct WebGPU, Doe API, and Doe routines over the same package surfaces.</desc>
4
+ <desc id="layers-desc">Layered package graph showing direct WebGPU and Doe API over the same package surfaces.</desc>
5
5
  <defs>
6
6
  <linearGradient id="layers-bg" x1="0%" y1="0%" x2="100%" y2="100%">
7
7
  <stop offset="0%" stop-color="#050816"/>
@@ -46,7 +46,7 @@
46
46
  <rect width="1200" height="470" fill="url(#layers-bg)"/>
47
47
  <rect width="1200" height="470" fill="url(#layers-glow-top)"/>
48
48
  <rect width="1200" height="470" fill="url(#layers-glow-bottom)"/>
49
- <text x="64" y="62" class="title">Same package, four layers</text>
49
+ <text x="64" y="62" class="title">Same package, three layers</text>
50
50
  <text x="64" y="94" class="subtitle">The package surface stays the same while the API gets progressively higher-level.</text>
51
51
 
52
52
  <rect x="170" y="122" width="860" height="64" rx="20" fill="url(#layers-root)" stroke="#c4b5fd" class="box"/>
@@ -59,5 +59,5 @@
59
59
  <text x="600" y="343" text-anchor="middle" class="nodeTitle">Doe API</text>
60
60
 
61
61
  <rect x="360" y="398" width="480" height="52" rx="18" fill="url(#layers-routines)" stroke="#fde68a" class="box"/>
62
- <text x="600" y="431" text-anchor="middle" class="nodeTitle">Doe routines</text>
62
+ <text x="600" y="431" text-anchor="middle" class="nodeTitle">Doe API (`gpu.compute(...)`)</text>
63
63
  </svg>