@plasius/gpu-renderer 0.2.5 → 0.2.7

package/CHANGELOG.md CHANGED
@@ -23,25 +23,14 @@ All notable changes to this project will be documented in this file.
23
23
  - **Security**
24
24
  - (placeholder)
25
25
 
26
- ## [0.2.5] - 2026-06-14
27
-
28
- - **Added**
29
- - (placeholder)
30
-
31
- - **Changed**
32
- - (placeholder)
33
-
34
- - **Fixed**
35
- - (placeholder)
36
-
37
- - **Security**
38
- - (placeholder)
39
-
40
- ## [0.2.4] - 2026-06-11
26
+ ## [0.2.7] - 2026-06-15
41
27
 
42
28
  - **Added**
43
29
  - Added `updateCamera(...)` support for wavefront renderers so validation
44
30
  views can animate camera movement without rebuilding mesh buffers.
31
+ - Added `gpuWorkerJobs` diagnostics to wavefront frame stats so callers can
32
+ inspect completed compute-dispatch jobs per frame, per second, and per
33
+ command submission alongside the existing GPU parallelism figures.
45
34
  - Added `gpuParallelism` diagnostics to wavefront frame stats and snapshots so
46
35
  consumers can inspect adapter compute limits, direct workgroups,
47
36
  indirect-dispatch estimates, and whether a frame exposes multi-workgroup GPU
@@ -52,17 +41,76 @@ All notable changes to this project will be documented in this file.
52
41
  - Added default-on `deferredPathResolve` support for wavefront tracing so
53
42
  surface traversal records material responses and terminal source radiance
54
43
  before the output pass resolves the path backward into pixel accumulation.
44
+ - Added design and ADR coverage for shadow-tested direct light and bounded
45
+ bounce attenuation in deferred wavefront path resolution.
46
+ - Added display-quality material-response support for sheen, clearcoat, and
47
+ decoded base-colour, metallic-roughness, normal, and occlusion maps in the
48
+ wavefront mesh path.
49
+ - Added GPU hit-time material atlas sampling for display-quality wavefront
50
+ tracing so exact hit UVs now drive base-colour, metallic-roughness, normal,
51
+ occlusion, and emissive texture evaluation on the GPU instead of relying on
52
+ CPU-baked triangle averages.
53
+ - Added ADR and design coverage for generic glTF material transport so
54
+ specular colour, sheen colour, transmission, clearcoat, and IOR can travel
55
+ through the shared wavefront shading path instead of being inferred from
56
+ demo-specific material names.
57
+ - Added ADR and design coverage for environment-driven glossy surface
58
+ response so specular, sheen, and clearcoat shading can use reflection-
59
+ aligned environment radiance instead of leaning primarily on a sun-only
60
+ highlight proxy.
61
+ - Added ADR and design coverage for prefiltered HDRI, BRDF LUT, and MIS-based
62
+ environment lighting in the wavefront display-quality path.
63
+ - Added ADR/design coverage and public contract support for authored volume
64
+ transport so mesh materials can derive Beer-Lambert media from glTF-style
65
+ attenuation inputs while preserving shell thickness in GPU material
66
+ packing.
67
+ - Added adaptive `renderFrame(...)` sampling controls so callers can cap a
68
+ frame with `frameTimeBudgetMs`, guarantee a `minimumSamplesPerPixel`, and
69
+ inspect actual `renderedSamplesPerPixel` separately from the configured SPP
70
+ ceiling.
71
+ - Added `createWavefrontAdaptiveSamplingLevels(...)` so consumers can hand
72
+ wavefront SPP adaptation to `@plasius/gpu-performance` without duplicating
73
+ renderer-specific ladder construction in app or demo code.
55
74
 
56
75
  - **Changed**
76
+ - Changed internal wavefront frame batching and dispatch-diagnostics plumbing
77
+ to live in a dedicated runtime helper module so scheduling concerns stay
78
+ separated from shader and pipeline assembly.
57
79
  - Changed wavefront frame dispatch to split large tile/sample workloads into
58
80
  bounded command submissions instead of encoding an entire high-resolution
59
81
  frame into one command buffer.
60
- - Changed display-quality wavefront tracing to require one additional
61
- path-vertex storage buffer, raising the trace-stage storage-buffer request
62
- from 9 to 10.
82
+ - Changed display-quality wavefront tracing to pack HDRI importance-sampling
83
+ PDFs and CDFs into one GPU texture so the MIS path stays compatible with
84
+ adapters that expose only the existing 10 trace-stage storage buffers.
63
85
  - Changed wavefront terminal/direct environment estimates to consume
64
86
  `environmentLighting.sunlitBaseline` as a time-of-day daylight floor instead
65
87
  of relying only on restrained ambient colour.
88
+ - Changed deferred wavefront surface resolution to allow explicit
89
+ shadow-tested direct lighting before terminal continuation resolution and to
90
+ remap extremely dark bounce responses to a small scene-brightness floor.
91
+ - Changed wavefront denoise to adapt its kernel strength to SPP and to skip
92
+ the intermediate full-screen scratch pass for 4+ SPP frames, reducing blur
93
+ and denoise cost on cleaner renders.
94
+ - Changed async wavefront frame rendering to wait for submitted GPU work and
95
+ to batch higher-SPP workloads more defensibly, while raising the SPP ceiling
96
+ from 64 to 256 for higher-end GPUs.
97
+ - Changed awaited high-SPP wavefront rendering to fence submitted GPU work
98
+ once per `renderFrame(...)` call instead of after every intermediate command
99
+ submission.
100
+ - Changed wavefront scene, mesh, triangle, material, and hit records to carry
101
+ generic glTF-style specular colour, sheen colour, and transmission inputs
102
+ in addition to the existing roughness, metallic, clearcoat, and IOR data.
103
+ - Changed direct and terminal wavefront surface environment shading to sample
104
+ reflection-aligned environment radiance for glossy materials before adding
105
+ narrower sun highlights, improving leather, chrome, and polished-surface
106
+ response without model-specific rules.
107
+ - Changed wavefront environment-lighting setup to prepare roughness-aware HDRI
108
+ resources, a BRDF integration LUT, and HDRI importance-sampling tables for
109
+ display-quality renders.
110
+ - Changed wavefront continuation sampling to emit BSDF PDFs for diffuse,
111
+ conductor, clearcoat, and transmission paths so environment misses and
112
+ explicit HDRI samples can use MIS instead of the older light-guidance
113
+ heuristics.
66
114
 
67
115
  - **Fixed**
68
116
  - Fixed low-sample wavefront renders so non-emissive surface hits receive a
@@ -70,6 +118,12 @@ All notable changes to this project will be documented in this file.
70
118
  - Reduced deterministic environment fill on direct surface hits so ambient
71
119
  rescue lighting no longer washes dark materials toward the full scene
72
120
  ambient colour.
121
+ - Fixed wavefront output-probe readback so higher-SPP validation renders wait
122
+ for submitted GPU work before mapping the staging buffer, avoiding
123
+ `GPUBuffer.mapAsync(...)` lifetime failures during probe capture.
124
+ - Fixed display-quality environment misses so non-delta BSDF paths now apply
125
+ MIS weighting against the HDRI direction PDF instead of overcounting raw
126
+ sky radiance on termination.
73
127
 
74
128
  - **Security**
75
129
  - (placeholder)
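
Reviewer note on the MIS fix above: the changelog says non-delta BSDF paths now weight environment misses against the HDRI direction PDF, but it does not say which heuristic is used. The sketch below assumes the common power heuristic (exponent 2). The function and parameter names are hypothetical and are not taken from the package.

```js
// Power heuristic (exponent 2) for combining two sampling strategies.
// Assumption: the package may use a different heuristic (for example the balance heuristic).
function powerHeuristic(pdfA, pdfB) {
  const a = pdfA * pdfA;
  const b = pdfB * pdfB;
  const sum = a + b;
  return sum > 0 ? a / sum : 0;
}

// Weight applied to environment radiance reached by a BSDF-sampled ray.
// Delta (perfectly specular) lobes cannot be reached by explicit HDRI sampling,
// so they keep the full weight of 1.
function environmentMissWeight(bsdfPdf, hdriDirectionPdf, isDeltaLobe) {
  if (isDeltaLobe) return 1;
  return powerHeuristic(bsdfPdf, hdriDirectionPdf);
}
```

Without such a weight, a direction that both the BSDF sample and the explicit HDRI sample can produce is counted twice, which matches the "overcounting raw sky radiance" symptom described in the entry.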
@@ -363,5 +417,4 @@ All notable changes to this project will be documented in this file.
363
417
  [0.1.12]: https://github.com/Plasius-LTD/gpu-renderer/releases/tag/v0.1.12
364
418
  [0.1.14]: https://github.com/Plasius-LTD/gpu-renderer/releases/tag/v0.1.14
365
419
  [0.2.1]: https://github.com/Plasius-LTD/gpu-renderer/releases/tag/v0.2.1
366
- [0.2.4]: https://github.com/Plasius-LTD/gpu-renderer/releases/tag/v0.2.4
367
- [0.2.5]: https://github.com/Plasius-LTD/gpu-renderer/releases/tag/v0.2.5
420
+ [0.2.7]: https://github.com/Plasius-LTD/gpu-renderer/releases/tag/v0.2.7
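
Reviewer note on the adaptive `renderFrame(...)` entry above: the changelog describes `samplesPerPixel` as a ceiling, `minimumSamplesPerPixel` as a guaranteed floor, and `renderedSamplesPerPixel` as the delivered count. A minimal sketch of that contract follows. `renderWithBudget`, `renderOneSample`, and `now` are hypothetical stand-ins, and the synchronous CPU-side timing is an assumption; the real renderer schedules GPU work and may measure time differently.

```js
// Hypothetical sketch of a budgeted sample loop; not the package's implementation.
// renderOneSample(index) stands in for one full-screen sample pass.
// now() stands in for a millisecond clock and is injectable for testing.
function renderWithBudget(options, renderOneSample, now = () => performance.now()) {
  const { samplesPerPixel, minimumSamplesPerPixel = 1, frameTimeBudgetMs } = options;
  const ceiling = Math.max(1, samplesPerPixel);
  // The floor can never exceed the configured ceiling.
  const floor = Math.min(ceiling, Math.max(1, minimumSamplesPerPixel));
  const start = now();
  let rendered = 0;
  while (rendered < ceiling) {
    const overBudget =
      frameTimeBudgetMs !== undefined && now() - start >= frameTimeBudgetMs;
    // Stop early only once the guaranteed minimum has been delivered.
    if (overBudget && rendered >= floor) break;
    renderOneSample(rendered);
    rendered += 1;
  }
  // Report the configured ceiling and the delivered count separately.
  return { samplesPerPixel: ceiling, renderedSamplesPerPixel: rendered };
}
```

Reporting both numbers lets a caller tell a deliberately cheap motion frame apart from a frame that reached the configured quality.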
package/README.md CHANGED
@@ -113,7 +113,10 @@ debug validation scenes. It is compute-driven, tiled, and breadth-first by
113
113
  bounce depth, so queue buffers are bounded by tile size instead of presentation
114
114
  resolution. Renderer-owned GPU record sizes are part of the public compute
115
115
  limits so ray, hit, triangle, BVH, and accumulation buffers stay aligned with
116
- their WGSL layouts.
116
+ their WGSL layouts. Frame submission batching and dispatch diagnostics are kept
117
+ in dedicated runtime helpers so performance-facing integrations can consume
118
+ stable frame stats without inheriting the renderer's shader and pipeline
119
+ assembly internals.
117
120
 
118
121
  ```js
119
122
  import {
@@ -168,7 +171,8 @@ debugRenderer.renderOnce();
168
171
  ```
169
172
 
170
173
  Scene objects currently support analytic `sphere` and axis-aligned `box`
171
- records with colour, emission, roughness, metallic, opacity, and IOR fields.
174
+ records with colour, emission, roughness, metallic, opacity, IOR, clearcoat,
175
+ sheen colour, specular colour, and transmission fields.
172
176
  These records are debug fixtures only. Product Studio visual rendering requires
173
177
  the mesh BVH path described in
174
178
  `docs/adrs/adr-0007-triangle-mesh-wavefront-path-tracing.md`. This is a
@@ -185,6 +189,27 @@ fixtures and deterministic layout tests. GPU BVH construction now uses
185
189
  Morton-style centroid keys to sort leaf references before sorted leaves and
186
190
  level-concurrent internal nodes are materialized. The current mesh path is the
187
191
  GPU runtime baseline under active hardening.
192
+ When mesh inputs also carry UVs plus decoded base-colour,
193
+ metallic-roughness, normal, occlusion, or emissive maps, the display-quality
194
+ path now packs them into GPU texture atlases and samples them at the resolved
195
+ hit UV inside the wavefront trace pass. Generic glTF-style material factors
196
+ such as clearcoat, sheen colour, specular colour, transmission, and IOR are
197
+ also preserved through the GPU records so demo validation does not need
198
+ model-name overrides. CPU-side texture work is limited to load-time decode and
199
+ atlas packing; per-hit shading stays on the GPU. Direct and terminal glossy
200
+ response now also samples reflection-aligned environment radiance so leather,
201
+ chrome, and other polished authored materials can read from the active
202
+ environment map or procedural sky instead of relying mostly on a sun-direction
203
+ proxy.
204
+ Authored participating-media inputs can also enter through the same mesh path.
205
+ A mesh may provide an explicit `medium` block or a glTF-style `material.volume`
206
+ block with `thickness`, `attenuationColor`, and `attenuationDistance`; the
207
+ renderer derives the medium-table entry, carries the active medium id on the
208
+ GPU ray, and applies Beer-Lambert transmittance to travelled segments on the
209
+ GPU. Thickness is now preserved in material packing for later shell-volume
210
+ work, but the current renderer still tracks only one active medium id per ray
211
+ and does not yet implement nested media, spectral dispersion, or a branching
212
+ reflection/refraction tree.
188
213
 
189
214
  `samplesPerPixel` controls how many GPU primary-ray samples are accumulated per
190
215
  screen pixel within a single render. This multiplies dispatch work but does not
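
Reviewer note on the `material.volume` paragraph in this hunk: the README says the renderer derives a medium entry from `attenuationColor` and `attenuationDistance` and applies Beer-Lambert transmittance to travelled segments. The sketch below uses the usual glTF volume convention, where the attenuation colour is the transmittance reached after travelling `attenuationDistance`. The helper names are hypothetical, and whether the package derives its coefficients exactly this way is an assumption.

```js
// Per-channel attenuation coefficient: sigma = -ln(color) / distance.
// A non-finite or non-positive distance is treated as "no attenuation" (assumption).
function attenuationCoefficient(attenuationColor, attenuationDistance) {
  if (!Number.isFinite(attenuationDistance) || attenuationDistance <= 0) {
    return [0, 0, 0];
  }
  // Clamp the colour away from zero so the logarithm stays finite.
  return attenuationColor.map(
    (channel) => -Math.log(Math.max(channel, 1e-6)) / attenuationDistance
  );
}

// Beer-Lambert transmittance over one travelled segment: T = exp(-sigma * length).
function segmentTransmittance(sigma, segmentLength) {
  return sigma.map((s) => Math.exp(-s * segmentLength));
}
```

Under this convention a ray that travels exactly `attenuationDistance` inside the medium is tinted by `attenuationColor`, and twice that distance squares the tint per channel.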
@@ -207,9 +232,12 @@ better material PDFs are hardened. By default, `deferredPathResolve` records
207
232
  per-bounce material responses in a tile-bounded path buffer and records the
208
233
  terminal emissive/HDRI/environment source in the final path slot. The output
209
234
  pass then resolves that recorded path backward and adds the weighted sample to
210
- the pixel accumulation, so surface traversal no longer injects broad visible
211
- ambient/direct environment light before a terminal source is known. Set
212
- `deferredPathResolve: false` only for legacy forward-accumulation comparison.
235
+ the pixel accumulation, so unresolved continuation light is still deferred until
236
+ a terminal source is known. Surface resolution may still add a small
237
+ shadow-tested direct-light term immediately when it has an explicit source and
238
+ visibility result, which keeps true occlusion shadows possible without falling
239
+ back to broad per-bounce ambient fill. Set `deferredPathResolve: false` only
240
+ for legacy forward-accumulation comparison.
213
241
  When an `environmentMap` is provided, the wavefront trace shader samples it as
214
242
  an equirectangular radiance source for environment misses and uses the same
215
243
  mapped radiance for terminal residuals before falling back to static ambient.
@@ -217,14 +245,28 @@ The procedural horizon/zenith/sun model remains the fallback for callers that
217
245
  have not supplied an HDRI/radiance texture. `environmentLighting.sunlitBaseline`
218
246
  adds a time-of-day daylight floor to terminal and direct environment estimates,
219
247
  so bright presets retain colour at the last collision without returning to a
220
- whitewashed global ambient term.
248
+ whitewashed global ambient term. Extremely dark recorded bounce responses are
249
+ also remapped to a small scene-brightness-driven luminance floor so bright
250
+ low-sample scenes do not produce isolated black speckles when a valid terminal
251
+ source was already found.
221
252
  For static mesh scenes, the GPU acceleration build is submitted once and then
222
253
  reused by subsequent frames. Per-frame tracing writes one dynamic uniform slot
223
254
  per tile/sample or post-process pass and batches tile tracing, tile output,
224
255
  optional denoise, and presentation into bounded command submissions controlled
225
256
  by `maxFramePassesPerSubmission` to keep 4K/high-spp command buffers from
226
257
  becoming oversized. `updateCamera(...)` can update the per-frame camera uniforms
227
- between renders without rebuilding mesh buffers. Frame stats and snapshots expose
258
+ without rebuilding scene buffers. `renderFrame(...)` also accepts an optional
259
+ `frameTimeBudgetMs` plus `minimumSamplesPerPixel`: when present, configured
260
+ `samplesPerPixel` becomes a ceiling instead of a hard requirement, the renderer
261
+ guarantees at least the minimum full-screen pass, and frame stats report both
262
+ configured `samplesPerPixel` and actual `renderedSamplesPerPixel` so realtime
263
+ callers can budget motion frames without overstating delivered quality.
264
+ For consumers that want to hand wavefront SPP adaptation to
265
+ `@plasius/gpu-performance`, `createWavefrontAdaptiveSamplingLevels(...)` exposes
266
+ a bounded low-to-high ladder of per-frame `samplesPerPixel`,
267
+ `frameTimeBudgetMs`, and `minimumSamplesPerPixel` configs that stay aligned
268
+ with the renderer's supported adaptive-sampling surface. Frame stats and
269
+ snapshots expose
228
270
  `gpuParallelism` diagnostics with adapter compute limits, configured workgroup
229
271
  size, direct compute dispatches, known workgroups/invocations, indirect dispatch
230
272
  counts, and upper-bound indirect work estimates. WebGPU does not expose physical